Signal peptide prediction based on analysis of experimentally verified cleavage sites.

Article Details


Zhang Z, Henzel WJ

Protein Sci. 2004 Oct;13(10):2819-24. Epub 2004 Aug 31.

PubMed ID
15340161

A number of computational tools are available for detecting signal peptides, but their ability to locate signal peptide cleavage sites varies significantly and is often less than satisfactory. We characterized a set of 270 secreted recombinant human proteins by automated Edman analysis and used the verified cleavage sites to evaluate the success rate of a number of computational prediction programs. An examination of amino acid frequencies in the N-terminal region of the data set showed a preference for proline and glutamine but a bias against tyrosine. Comparing the data set to the SWISS-PROT database revealed a high percentage of discrepancies with cleavage site annotations that were computationally generated. The best program for predicting signal sequences was found to be SignalP 2.0-NN, with an accuracy of 78.1% for cleavage site recognition. The new data set can be used to refine prediction algorithms, and we have built an improved profile hidden Markov model for signal peptides based on the new data.
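The cleavage-site accuracy reported above can be read as the fraction of verified proteins for which a program predicts exactly the experimentally determined cleavage position. The following is a minimal sketch under that assumption; the abstract does not spell out the scoring rule, and the function name, the position convention (index of the last signal-peptide residue), and the toy data are all hypothetical, not taken from the article.

```python
def cleavage_site_accuracy(verified, predicted):
    """Fraction of verified proteins whose predicted cleavage position matches.

    verified:  dict of protein ID -> experimentally verified cleavage position
    predicted: dict of protein ID -> predicted position, or None when the
               program predicts no signal peptide (counted as a miss)
    """
    if not verified:
        raise ValueError("verified set is empty")
    correct = sum(
        1 for pid, pos in verified.items() if predicted.get(pid) == pos
    )
    return correct / len(verified)


# Made-up toy data for illustration only (not from the 270-protein set).
verified = {"protA": 22, "protB": 19, "protC": 25, "protD": 30}
predicted = {"protA": 22, "protB": 21, "protC": 25, "protD": None}

accuracy = cleavage_site_accuracy(verified, predicted)
print(f"{accuracy:.1%}")  # prints 50.0%
```

Exact-match scoring is the strictest choice; a tolerance window around the verified position would be a different metric and would give higher numbers.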

DrugBank Data that Cites this Article

| Name | UniProt ID |
| --- | --- |
| Vascular endothelial growth factor receptor 3 | P35916 |
| Vascular endothelial growth factor A | P15692 |
| Platelet-derived growth factor receptor beta | P09619 |
| Calcitonin receptor | P30988 |
| High affinity immunoglobulin gamma Fc receptor I | P12314 |
| Epidermal growth factor receptor | P00533 |
| Cocaine- and amphetamine-regulated transcript protein | Q16568 |
| Platelet basic protein | P02775 |
| Tissue factor pathway inhibitor | P10646 |
| Insulin-like growth factor-binding protein 3 | P17936 |
| Vascular cell adhesion protein 1 | P19320 |
| Protein NOV homolog | P48745 |
| Dipeptidase 1 | P16444 |
| Tumor necrosis factor receptor superfamily member 5 | P25942 |
| Fibroblast growth factor receptor 4 | P22455 |
| TGF-beta receptor type-2 | P37173 |
| Insulin-like growth factor-binding protein 7 | Q16270 |
| Thioredoxin domain-containing protein 12 | O95881 |
| Carbonic anhydrase 9 | Q16790 |
| Toll-like receptor 4 | O00206 |
| Parathyroid hormone | P01270 |
| Angiopoietin-1 receptor | Q02763 |
| Prostate stem cell antigen | O43653 |
| Tumor necrosis factor receptor superfamily member 9 | Q07011 |
| T-cell surface glycoprotein CD4 | P01730 |
| SLAM family member 7 | Q9NQ25 |
| Retinoid-inducible serine carboxypeptidase | Q9HB40 |
| Lactase-like protein | Q6UWM7 |
| Serine protease inhibitor Kazal-type 6 | Q6UWN8 |
| L-amino-acid oxidase | Q96RQ9 |
| N-acetylmuramoyl-L-alanine amidase | Q96PD5 |
| C-type lectin domain family 14 member A | Q86T13 |
| Fibroblast growth factor 19 | O95750 |
| Trefoil factor 1 | P04155 |
| Prostate-specific antigen | P07288 |
| Calcitonin gene-related peptide type 1 receptor | Q16602 |
| Tyrosine-protein kinase receptor Tie-1 | P35590 |
| Ephrin type-B receptor 1 | P54762 |
| Ephrin type-B receptor 6 | O15197 |
| Zinc transporter ZIP6 | Q13433 |
| Angiopoietin-related protein 3 | Q9Y5C1 |
| CD27 antigen | P26842 |
| Vascular endothelial growth factor C | P49767 |
| CD276 antigen | Q5ZPR3 |
| IgG receptor FcRn large subunit p51 | P55899 |