Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry.

Article Details


Kelkar DS, Kumar D, Kumar P, Balakrishnan L, Muthusamy B, Yadav AK, Shrivastava P, Marimuthu A, Anand S, Sundaram H, Kingsbury R, Harsha HC, Nair B, Prasad TS, Chauhan DS, Katoch K, Katoch VM, Kumar P, Chaerkady R, Ramachandran S, Dash D, Pandey A

Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry.

Mol Cell Proteomics. 2011 Dec;10(12):M111.011627. doi: 10.1074/mcp.M111.011445. Epub 2011 Oct 3.

PubMed ID
21969609

Genome sequencing of the H37Rv strain of Mycobacterium tuberculosis was completed in 1998, followed by whole genome sequencing of a clinical isolate, CDC1551, in 2002. Since then, the genomic sequences of a number of other strains have become available, making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of its high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis, representing ~80% of its total predicted gene count. In addition to a protein database search, we carried out a genome database search, which led to the identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations for translational start sites while correcting those for 33 proteins. We report, for the first time, the creation of a high confidence set of protein coding regions in the Mycobacterium tuberculosis genome obtained by high resolution tandem mass spectrometry at both precursor and fragment detection steps. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced, for obtaining a more accurate catalogue of protein-coding genes.

DrugBank Data that Cites this Article

Name | UniProt ID
Possible cellulase CelA1 (Endoglucanase) (Endo-1,4-beta-glucanase) (FI-cmcase) (Carboxymethyl cellulase) | Q79G13
Cyclopropane mycolic acid synthase MmaA2 | Q79FX6
Conserved protein | O53240
Hydroxymycolate synthase MmaA4 | Q79FX8
DNA-directed RNA polymerase subunit beta' | P9WGY7
Probable arabinosyltransferase A | P9WNL9
2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase | P9WNC7
Probable arabinosyltransferase B | P9WNL7
Enoyl-[acyl-carrier-protein] reductase [NADH] | P9WGR1
Probable arabinosyltransferase C | P9WNL5
Cyclopropane mycolic acid synthase 2 | P9WPB5
Thymidylate kinase | P9WKE1
6,7-dimethyl-8-ribityllumazine synthase | P9WHE9
Deoxyuridine 5'-triphosphate nucleotidohydrolase | P9WNS5
Nicotinate-nucleotide pyrophosphorylase [carboxylating] | P9WJJ7
Cell division protein FtsZ | P9WN95
NADPH-ferredoxin reductase FprA | P9WIQ3
Pantothenate synthetase | P9WIL5
Purine nucleoside phosphorylase | P9WP01
Serine/threonine-protein kinase PknB | P9WI81
3-alpha-(or 20-beta)-hydroxysteroid dehydrogenase | P9WGT1
Diacylglycerol acyltransferase/mycolyltransferase Ag85C | P9WQN9
Putative 4-hydroxy-4-methyl-2-oxoglutarate aldolase | P9WGY3
3-oxoacyl-[acyl-carrier-protein] synthase 3 | P9WNG3
2-isopropylmalate synthase | P9WQB3
Malate synthase G | P9WK17
Isocitrate lyase | P9WKK7
3-dehydroquinate dehydratase | P9WPX7
Mycocyclosin synthase | P9WPP7
Dihydrofolate reductase | P9WNX1
Dihydropteroate synthase | P9WND1
4-hydroxy-tetrahydrodipicolinate reductase | P9WP23
D-3-phosphoglycerate dehydrogenase | P9WNX3
Cyclopropane mycolic acid synthase 3 | P9WPB3
Glutamine synthetase | P9WN39
Group 1 truncated hemoglobin GlbN | P9WN25
Aminoglycoside 2'-N-acetyltransferase | P9WQG9
Inositol-3-phosphate synthase | P9WKI1
Citrate lyase subunit beta-like protein | P9WPE1
Guanylate kinase | P9WKE9
Mycothiol acetyltransferase | P9WJM7
1,4-dihydroxy-2-naphthoyl-CoA synthase | P9WNP5
HTH-type transcriptional regulator EthR | P9WMC1
Ribose-5-phosphate isomerase B | P9WKD7
NAD(P)H dehydrogenase (quinone) | P9WHH7
UDP-galactopyranose mutase | P9WIQ1
Probable thiol peroxidase | P9WG35
dTDP-4-dehydrorhamnose 3,5-epimerase | P9WH11
Proteasome subunit alpha | P9WHU1
Cyclopropane mycolic acid synthase 1 | P9WPB7
Serine/threonine-protein kinase PknG | P9WI73
ATP-dependent dethiobiotin synthetase BioD | P9WPQ5
Cytochrome P450 130 | P9WPN5
Methionine aminopeptidase 2 | P9WK19
(2Z,6E)-farnesyl diphosphate synthase | P9WFF5
Decaprenyl diphosphate synthase | P9WFF7
4,5:9,10-diseco-3-hydroxy-5,9,17-trioxoandrosta-1(10),2-diene-4-oate hydrolase | P9WNH5
Probable L-lysine-epsilon aminotransferase | P9WQ77
Proteasome subunit beta | P9WHT9
R2-like ligand binding oxidase | P9WH69
Peptide deformylase | P9WIJ3
Iron-dependent extradiol dioxygenase | P9WNW7
3-oxoacyl-[acyl-carrier-protein] synthase 1 | P9WQD9
Secreted chorismate mutase | P9WIB9
Arylamine N-acetyltransferase | P9WJI5
Deazaflavin-dependent nitroreductase | P9WP15
Nucleoid-associated protein Lsr2 | P9WIP7
Uncharacterized MFS-type transporter EfpA | P9WJY5
16S/23S rRNA (cytidine-2'-O)-methyltransferase TlyA | P9WJ63
Probable fatty acid synthase Fas (Fatty acid synthetase) | P95029