Signal peptide prediction based on analysis of experimentally verified cleavage sites.
Article Details
- Citation
Zhang Z, Henzel WJ
Signal peptide prediction based on analysis of experimentally verified cleavage sites.
Protein Sci. 2004 Oct;13(10):2819-24. Epub 2004 Aug 31.
- PubMed ID
- 15340161
- Abstract
A number of computational tools are available for detecting signal peptides, but their abilities to locate the signal peptide cleavage sites vary significantly and are often less than satisfactory. We characterized a set of 270 secreted recombinant human proteins by automated Edman analysis and used the verified cleavage sites to evaluate the success rate of a number of computational prediction programs. An examination of the frequency of amino acids in the N-terminal region of the data set showed a preference for proline and glutamine but a bias against tyrosine. The data set was compared to the SWISS-PROT database and revealed a high percentage of discrepancies with cleavage site annotations that were computationally generated. The best program for predicting signal sequences was found to be SignalP 2.0-NN with an accuracy of 78.1% for cleavage site recognition. The new data set can be utilized for refining prediction algorithms, and we have built an improved version of a profile hidden Markov model for signal peptides based on the new data.
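The abstract reports accuracy for cleavage site recognition without stating the scoring rule. As a minimal sketch, assuming a prediction counts as correct only when the predicted cleavage position exactly matches the experimentally verified one, the metric could be computed as follows. The function name, protein identifiers, and positions are hypothetical and are not taken from the article.

```python
from typing import Dict, Optional


def cleavage_site_accuracy(
    verified: Dict[str, int],
    predicted: Dict[str, Optional[int]],
) -> float:
    """Return the fraction of proteins whose predicted cleavage
    position equals the experimentally verified position.

    A missing prediction, or a prediction of None (no signal peptide
    found), is counted as incorrect.
    """
    if not verified:
        raise ValueError("no verified cleavage sites given")
    correct = sum(
        1 for protein_id, site in verified.items()
        if predicted.get(protein_id) == site
    )
    return correct / len(verified)


# Hypothetical toy data: each value is the signal peptide length,
# i.e. cleavage occurs after that residue.
verified = {"protA": 19, "protB": 26, "protC": 22, "protD": 24}
predicted = {"protA": 19, "protB": 26, "protC": 21, "protD": None}

accuracy = cleavage_site_accuracy(verified, predicted)
print(f"Cleavage site accuracy: {accuracy:.1%}")
```

Exact-match scoring is the strictest choice; a tolerance of a few residues would be a different metric and would give higher numbers.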
DrugBank Data that Cites this Article
- Polypeptides
| Name | UniProt ID |
| --- | --- |
| Vascular endothelial growth factor receptor 3 | P35916 |
| Vascular endothelial growth factor A | P15692 |
| Platelet-derived growth factor receptor beta | P09619 |
| Calcitonin receptor | P30988 |
| High affinity immunoglobulin gamma Fc receptor I | P12314 |
| Epidermal growth factor receptor | P00533 |
| Cocaine- and amphetamine-regulated transcript protein | Q16568 |
| Lithostathine-1-alpha | P05451 |
| Platelet basic protein | P02775 |
| Pro-opiomelanocortin | P01189 |
| Tissue factor pathway inhibitor | P10646 |
| Insulin-like growth factor-binding protein 3 | P17936 |
| Vascular cell adhesion protein 1 | P19320 |
| Protein NOV homolog | P48745 |
| Dipeptidase 1 | P16444 |
| Tumor necrosis factor receptor superfamily member 5 | P25942 |
| Fibroblast growth factor receptor 4 | P22455 |
| TGF-beta receptor type-2 | P37173 |
| Insulin-like growth factor-binding protein 7 | Q16270 |
| Thioredoxin domain-containing protein 12 | O95881 |
| Carbonic anhydrase 9 | Q16790 |
| Toll-like receptor 4 | O00206 |
| Interleukin-10 | P22301 |
| Zinc-alpha-2-glycoprotein | P25311 |
| Parathyroid hormone | P01270 |
| Angiopoietin-1 receptor | Q02763 |
| Prostate stem cell antigen | O43653 |
| Promotilin | P12872 |
| Tumor necrosis factor receptor superfamily member 9 | Q07011 |
| T-cell surface glycoprotein CD4 | P01730 |
| SLAM family member 7 | Q9NQ25 |
| Retinoid-inducible serine carboxypeptidase | Q9HB40 |
| Lactase-like protein | Q6UWM7 |
| Follistatin | P19883 |
| Serine protease inhibitor Kazal-type 6 | Q6UWN8 |
| L-amino-acid oxidase | Q96RQ9 |
| N-acetylmuramoyl-L-alanine amidase | Q96PD5 |
| C-type lectin domain family 14 member A | Q86T13 |
| Fibroblast growth factor 19 | O95750 |
| Interleukin-17A | Q16552 |
| Trefoil factor 1 | P04155 |
| Prostate-specific antigen | P07288 |
| Calcitonin gene-related peptide type 1 receptor | Q16602 |
| Dermcidin | P81605 |
| Ficolin-3 | O75636 |
| Tyrosine-protein kinase receptor Tie-1 | P35590 |
| Ephrin type-B receptor 1 | P54762 |
| Ephrin type-B receptor 6 | O15197 |
| Sclerostin | Q9BQB4 |
| Zinc transporter ZIP6 | Q13433 |
| Angiopoietin-related protein 3 | Q9Y5C1 |
| CD27 antigen | P26842 |
| Vascular endothelial growth factor C | P49767 |
| CD276 antigen | Q5ZPR3 |
| IgG receptor FcRn large subunit p51 | P55899 |
| Interleukin-17F | Q96PD4 |