Search A Proteome

Advanced Proteome Searching

Use a PoPS specificity model to search an entire proteome for possible targets of your protease. Use the advanced options to limit the output. Is your favourite organism missing from this page? Send an email request to Sarah Boyd (Sarah.Boyd@Infotech.monash.edu.au). Alternatively, use this page to search for targets in a FASTA file.

Please note that results are only kept on the server for 72 hours.


Your email address, where the results will be sent:

The protease specificity model:

Select the organism (proteome):
Homo sapiens (human): 29,572 proteins
Saccharomyces cerevisiae (baker's yeast): 5,877 proteins
Escherichia coli K12: 10,667 proteins
Drosophila melanogaster (fruit fly): 19,620 proteins
Arabidopsis thaliana (thale cress): 30,622 proteins
Rattus norvegicus (Norway rat): 24,115 proteins
Mus musculus (house mouse): 56,950 proteins
Danio rerio (zebrafish): 30,556 proteins
Plasmodium falciparum (malaria): 5,270 proteins
Rhadinovirus Human herpesvirus 8: 82 proteins
Schistosoma mansoni (Blood fluke): 12 proteins

Derived from the RefSeq database (NCBI); last updated on 26 March 2006.

Select the preferred score threshold option:
Note: reasoning tables are limited to the top 5 sites in each protein, regardless of the score threshold. See here for more details.
Set the score threshold value to (any valid floating point number)
Set the score threshold to the maximum score in each substrate
Set the score threshold to the minimum score (all predictions in each substrate except -Infinity)
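
The three threshold options can be read as three ways of choosing a cut-off for each substrate. The following is a minimal sketch of that reading, not the PoPS implementation: the function name `filter_sites`, the mode names, and the inclusive comparison (score >= threshold) are all assumptions made for illustration.

```python
import math

def filter_sites(scores, mode, value=None):
    """Return the site scores kept for one substrate.

    mode is a hypothetical label for the three options on this page:
      "value" -> keep scores at or above a user-supplied number
      "max"   -> keep only the top-scoring site(s) in the substrate
      "min"   -> keep every site except those scoring -Infinity
    """
    # Sites scoring -Infinity are never reported (assumed for all modes).
    finite = [s for s in scores if s != -math.inf]
    if not finite:
        return []
    if mode == "value":
        threshold = float(value)
    elif mode == "max":
        threshold = max(finite)
    elif mode == "min":
        threshold = min(finite)
    else:
        raise ValueError(f"unknown threshold mode: {mode!r}")
    # Assumption: the threshold is inclusive.
    return [s for s in finite if s >= threshold]

scores = [-math.inf, 2.5, 7.0, 7.0, -1.0]
by_value = filter_sites(scores, "value", 2.0)
by_max = filter_sites(scores, "max")
by_min = filter_sites(scores, "min")
```

Under these assumptions, the "maximum" option keeps tied top-scoring sites, and the "minimum" option differs from the raw list only by dropping the -Infinity entries.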

Advanced Options


(Click here to skip the advanced options and go directly to the form submission).

Limit the output according to predicted accessibility:
Note: Only the top five most closely related structures are used.

Set the accessibility thresholds:
The default minimum percentage solvent accessibility is set to 33% for each subsite. To use different values, specify the minimum percentage solvent accessibility for each subsite, separated by commas (e.g. 33, 15, 28, ...):

Only return predictions that have at least:
Exclude predictions that have at least residue(s) predicted as buried in at least of the structures returned.
treat residues of unknown accessibility as accessible.
treat unknown structures as accessible.
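
As an illustration of the accessibility options, the sketch below parses a comma-separated threshold list and counts the residues of one site that fall below their subsite minimum. It is a hedged example only: the names `parse_thresholds` and `count_buried`, the use of `None` for unknown accessibility, and the rule "below the minimum means buried" are assumptions, not the documented PoPS behaviour.

```python
DEFAULT_MIN_ACCESSIBILITY = 33.0  # default stated on this page, in percent

def parse_thresholds(text, n_subsites):
    """Parse e.g. "33, 15, 28" into one minimum percentage per subsite."""
    if not text.strip():
        # Nothing entered: fall back to the default for every subsite.
        return [DEFAULT_MIN_ACCESSIBILITY] * n_subsites
    values = [float(v) for v in text.split(",")]
    if len(values) != n_subsites:
        raise ValueError(f"expected {n_subsites} values, got {len(values)}")
    for v in values:
        if not 0.0 <= v <= 100.0:
            raise ValueError(f"percentage out of range: {v}")
    return values

def count_buried(accessibility, thresholds, unknown_as_accessible=True):
    """Count residues whose accessibility is below the subsite minimum.

    accessibility holds one percentage per subsite, or None when the
    accessibility of that residue is unknown (hypothetical encoding).
    """
    buried = 0
    for acc, minimum in zip(accessibility, thresholds):
        if acc is None:
            if not unknown_as_accessible:
                buried += 1
        elif acc < minimum:
            buried += 1
    return buried

thresholds = parse_thresholds("33, 15, 28", 3)
lenient = count_buried([40.0, 10.0, None], thresholds)
strict = count_buried([40.0, 10.0, None], thresholds, unknown_as_accessible=False)
```

The `unknown_as_accessible` flag mirrors the "treat residues of unknown accessibility as accessible" checkbox: when it is switched off, unknown residues count against the site.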

Limit the output according to secondary structure:
Note: Only the top five most closely related structures are used. See below for the DSSP code.

Only return predictions that have at least:
residue(s) predicted as unstructured in at least structure(s):
treat residues of unknown structure as unstructured.
treat unknown structures as unstructured.

Exclude predictions that have at least:
residue(s) predicted as G in at least structure(s).
residue(s) predicted as H in at least structure(s).
residue(s) predicted as I in at least structure(s).
residue(s) predicted as T in at least structure(s).
residue(s) predicted as E in at least structure(s).
residue(s) predicted as B in at least structure(s).
residue(s) predicted as S in at least structure(s).
residue(s) predicted as ? in at least structure(s).
 
For each of these selections:
remove the prediction if the number of specified structures is not available.
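
The exclusion rules above share one pattern: a residue count for a DSSP code, required in a minimum number of structures. A minimal sketch of that pattern follows, assuming each related structure is supplied as a string of DSSP codes covering the residues of the predicted site. The function name `should_exclude` and its parameters are hypothetical, and the handling of too few structures is one possible reading of the final checkbox.

```python
def should_exclude(structures, code, min_residues, min_structures,
                   remove_if_too_few=False):
    """Decide whether a prediction is excluded for one DSSP code.

    structures: one DSSP string per related structure (at most five are
    used on this page), each covering the residues of the site.
    """
    if len(structures) < min_structures:
        # Fewer structures than the rule asks for: exclude only if the
        # "remove the prediction" option is ticked.
        return remove_if_too_few
    # Count the structures in which the code occurs often enough.
    hits = sum(1 for s in structures if s.count(code) >= min_residues)
    return hits >= min_structures

site_structures = ["HHHHT.SS", "HHH.T.SS", "EEEE..SS"]
helix_rule = should_exclude(site_structures, "H", 3, 2)
strand_rule = should_exclude(site_structures, "E", 3, 2)
```

Here the helix rule fires because two of the three structures contain at least three H residues, while the strand rule does not because only one structure contains E residues.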

Limit the output according to the number of predicted sites:
Use the following options to exclude proteins from the results that have more than a specific number of predicted cleavage sites:

No Limit (default)
Exclude substrates containing more than predicted cleavage sites.
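
A short sketch of this limit, assuming predictions are held as a mapping from protein identifier to a list of predicted site positions. The function name `limit_by_site_count` and the identifiers in the example are made up for illustration.

```python
def limit_by_site_count(predictions, max_sites=None):
    """Drop substrates with more than max_sites predicted cleavage sites.

    max_sites=None corresponds to "No Limit (default)".
    """
    if max_sites is None:
        return dict(predictions)
    return {protein: sites
            for protein, sites in predictions.items()
            if len(sites) <= max_sites}

# Hypothetical identifiers and site positions.
predictions = {"protein_A": [12, 88], "protein_B": [5, 19, 40, 77]}
limited = limit_by_site_count(predictions, max_sites=2)
unlimited = limit_by_site_count(predictions)
```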


The DSSP code.

You can read more about DSSP on Wikipedia. The DSSP code is as follows:

  • G = 3-turn helix (3_10 helix). Minimum length 3 residues.
  • H = 4-turn helix (alpha helix). Minimum length 4 residues.
  • I = 5-turn helix (pi helix). Minimum length 5 residues.
  • T = hydrogen bonded turn (3, 4 or 5 turn)
  • E = beta sheet in parallel and/or anti-parallel sheet conformation (extended strand). Minimum length 2 residues.
  • B = residue in isolated beta-bridge (single pair beta-sheet hydrogen bond formation)
  • S = bend (the only non-hydrogen-bond based assignment)
  • . = no regular secondary structure

In addition, PoPS uses the symbol '-' to represent residues where the PDB structure doesn't align, and blank spaces (' ') to represent where no structure is available.
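
For readers who post-process result files, the code table above can be written as a lookup. This is a convenience sketch only: the dictionary and function names are hypothetical, and the descriptions are copied from the list on this page.

```python
DSSP_CODES = {
    "G": "3-turn helix (3_10 helix)",
    "H": "4-turn helix (alpha helix)",
    "I": "5-turn helix (pi helix)",
    "T": "hydrogen bonded turn",
    "E": "extended strand in beta sheet",
    "B": "residue in isolated beta-bridge",
    "S": "bend",
    ".": "no regular secondary structure",
}

# Extra symbols that this page says PoPS adds to the DSSP alphabet.
POPS_SYMBOLS = {
    "-": "PDB structure does not align",
    " ": "no structure available",
}

def describe(annotation):
    """Return one description per character of a structure string."""
    table = {**DSSP_CODES, **POPS_SYMBOLS}
    return [table.get(ch, "unrecognised symbol") for ch in annotation]

labels = describe("H- ")
```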