Genome sequencing of the H37Rv strain of Mycobacterium tuberculosis was completed in 1998, followed by whole-genome sequencing of a clinical isolate, CDC1551, in 2002. Since then, the genomic sequences of a number of other strains have become available, making M. tuberculosis one of the better-studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of its high GC content and its dissimilarity to model prokaryotes. To address this, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both the MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis, representing ~80% of its total predicted gene count. In addition to a protein database search, we carried out a genome database search, which led to the identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein-coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations of translational start sites and corrected those for 33 proteins. We report, for the first time, a high-confidence set of protein-coding regions in the Mycobacterium tuberculosis genome, obtained by tandem mass spectrometry with high resolution at both the precursor and fragment detection steps. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced, yielding a more accurate catalogue of protein-coding genes.
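A genome database search of the kind mentioned above is commonly built on a six-frame translation of the genome, so that peptides can match regions with no annotated gene. The abstract does not say how the search database was constructed, so the following is only a minimal sketch under that assumption. The function names are hypothetical, and the standard codon assignments are used (the bacterial table differs only in its permitted start codons).

```python
from itertools import product

# Standard codon assignments, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to translated sequences ('*' = stop)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

In practice the translated frames would be split at stop codons and digested in silico before being searched against the spectra; those steps are omitted here.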
The PIK3CA gene is frequently mutated in human cancers. Here we carry out a SILAC-based quantitative phosphoproteomic analysis using isogenic knock-in cell lines containing ‘driver’ oncogenic mutations of PIK3CA to dissect the signaling mechanisms responsible for the oncogenic phenotypes induced by mutant PIK3CA. From 8,075 unique phosphopeptides identified, we observe that aberrant activation of the PI3K pathway leads to increased phosphorylation of a surprisingly wide variety of kinases and downstream signaling networks. By integrating the phosphoproteomic data with human protein microarray-based AKT1 kinase assays, we discover and validate six novel AKT1 substrates, including cortactin. Through mutagenesis studies, we demonstrate that phosphorylation of cortactin by AKT1 is important for mutant PI3K-enhanced cell migration and invasion. Our study describes a quantitative and global approach for identifying mutation-specific signaling events and for discovering novel signaling molecules as readouts of pathway activation or as potential therapeutic targets.
Proteogenomics has emerged as a valuable approach in cancer research. It integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed variant protein sequences that may have functional roles in cancer. The approach is computationally intensive and requires integrating disparate software tools into sophisticated workflows, which hinders its adoption by bench scientists who are not informatics experts. To address this need, we have developed an extensible, Galaxy-based resource aimed at giving more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: 1) generation of customized, annotated protein sequence databases from RNA-Seq data; and 2) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub.
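The novelty filtering in step 2 can be illustrated with a simple check: a candidate peptide counts as novel only if it does not occur in any reference protein sequence. This is a minimal sketch, not the workflow used in the Galaxy resource, whose tools are not named in the abstract. The function names are hypothetical, and treating isoleucine and leucine as equivalent is an assumption based on the two residues having the same mass.

```python
def normalize(sequence):
    """Uppercase and merge I into L, since the two residues are isobaric."""
    return sequence.upper().replace("I", "L")


def novel_peptides(peptides, reference_proteins):
    """Return the peptides that are not substrings of any reference protein."""
    references = [normalize(protein) for protein in reference_proteins]
    return [
        peptide
        for peptide in peptides
        if not any(normalize(peptide) in ref for ref in references)
    ]


if __name__ == "__main__":
    reference = ["MKTAYIAKQR"]          # made-up reference protein
    candidates = ["TAYLAK", "TAYVAK"]   # made-up candidate peptides
    print(novel_peptides(candidates, reference))
```

Here the first candidate is rejected because it matches the reference once I and L are merged, while the second carries a substitution and is kept. A real filter would also need to consider near-matches and alternative explanations such as modified reference peptides.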