Large-scale protein sequencing methods rely on enzymatic digestion of complex protein mixtures to generate a collection of peptides for mass spectrometric analysis. Here we examine the use of multiple proteases (trypsin, LysC, ArgC, AspN, and GluC) to improve both protein identification and characterization in the model organism Saccharomyces cerevisiae. Using a data-dependent, decision tree-based algorithm to tailor MS2 fragmentation method to peptide precursor, we identified 92,095 unique peptides (609,665 total) mapping to 3,908 proteins at a 1% false discovery rate (FDR). These results were a significant improvement upon data from a single protease digest (trypsin) – 27,822 unique peptides corresponding to 3,313 proteins. The additional 595 protein identifications were mainly from those at low abundances (i.e., < 1,000 copies/cell); sequence coverage for these proteins was likewise improved nearly 3-fold. We demonstrate that large portions of the proteome are simply inaccessible following digestion with a single protease and that multiple proteases, rather than technical replicates, provide a direct route to increase both protein identifications and proteome sequence coverage.
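The reasoning behind combining proteases can be illustrated with a small in silico digestion. The sketch below is not the authors' pipeline. It assumes simplified textbook cleavage rules (trypsin after K/R, LysC after K, ArgC after R, GluC after E, AspN before D), ignores missed cleavages and the proline exception, and uses a hypothetical length window to stand in for the range of peptides a mass spectrometer can identify. The function names `digest` and `coverage` and the toy sequence are illustrative only.

```python
# Simplified cleavage rules (assumptions for illustration only):
# "C" means cleave after the residue, "N" means cleave before it.
RULES = {
    "trypsin": ("KR", "C"),
    "LysC": ("K", "C"),
    "ArgC": ("R", "C"),
    "GluC": ("E", "C"),
    "AspN": ("D", "N"),
}


def digest(sequence, protease):
    """Return the peptides from a complete digest with no missed cleavages."""
    residues, side = RULES[protease]
    cut_sites = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            cut_sites.append(i + 1 if side == "C" else i)
    # Add the sequence ends and drop duplicate boundaries.
    bounds = sorted(set([0] + cut_sites + [len(sequence)]))
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def coverage(sequence, proteases, min_len=7, max_len=35):
    """Fraction of residues covered by peptides inside the length window.

    The length window is a hypothetical proxy for detectability.
    """
    covered = [False] * len(sequence)
    for protease in proteases:
        pos = 0
        for peptide in digest(sequence, protease):
            if min_len <= len(peptide) <= max_len:
                for j in range(pos, pos + len(peptide)):
                    covered[j] = True
            pos += len(peptide)
    return sum(covered) / len(sequence)


if __name__ == "__main__":
    toy = "MKDAERLK"  # toy sequence, not a real protein
    print(digest(toy, "trypsin"))
    print(coverage(toy, ["trypsin"], min_len=3, max_len=5))
    print(coverage(toy, ["trypsin", "GluC"], min_len=3, max_len=5))
```

In this toy case, trypsin alone leaves half of the residues in peptides too short for the window, while adding GluC places those residues in peptides of usable length. Real digests are more complicated, but the same logic explains why a second protease can reach regions that replicate tryptic runs cannot.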
By characterizing dynamic changes in yeast protein abundance following osmotic shock, this study shows that the correlation between protein and mRNA differs for transcripts that increase versus decrease in abundance, and reveals physiological reasons for these differences.
We describe a mass spectrometry method, QuantMode, which improves the accuracy of isobaric tag–based quantification by alleviating the pervasive problem of precursor interference—co-isolation of impurities—through gas-phase purification. QuantMode analysis of a yeast sample ‘contaminated’ with interfering human peptides showed substantially improved quantitative accuracy compared to a standard scan, with a small loss of spectral identifications. This technique will allow large-scale, multiplexed quantitative proteomics analyses using isobaric tagging.
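A toy calculation can show why precursor interference distorts isobaric tag ratios. The sketch below is not QuantMode itself. It assumes a two-channel experiment in which co-isolated impurities contribute reporter signal split equally (1:1) between the channels. That split, the function name `observed_ratio`, and the parameter `interference_fraction` are illustrative assumptions.

```python
def observed_ratio(true_ratio, interference_fraction):
    """Reporter ion ratio measured when impurities are co-isolated.

    true_ratio: channel A / channel B for the target peptide alone.
    interference_fraction: share of total reporter signal that comes
        from co-isolated impurities, assumed split 1:1 across channels.
    """
    target = 1.0 - interference_fraction
    # Target peptide signal, divided between channels by the true ratio.
    a = target * true_ratio / (true_ratio + 1.0)
    b = target * 1.0 / (true_ratio + 1.0)
    # Interference adds the same amount to both channels (assumption).
    background = interference_fraction / 2.0
    return (a + background) / (b + background)


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        print(fraction, round(observed_ratio(10.0, fraction), 2))
```

Under these assumptions, a true 10:1 ratio is measured as roughly 2.4:1 when half of the reporter signal comes from interference. Removing the impurities before reporter ions are generated, as gas-phase purification aims to do, corresponds to pushing `interference_fraction` toward zero.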
Combining high mass accuracy mass spectrometry, isobaric tagging, and novel software for multiplexed, large-scale protein quantification, we report deep proteomic coverage across multiple biological replicates and cell lines. We applied this method to study four human embryonic stem cell and four induced pluripotent stem cell lines in biological triplicate, a 24-sample comparison resulting in the largest set of identified proteins and phosphorylation sites in pluripotent cells to date. The statistical analysis afforded by this approach revealed, for the first time, subtle but reproducible differences in protein and protein phosphorylation between embryonic stem cells and induced pluripotent cells. Merging these results with RNA-seq analyses, we found functionally related differences across each tier of regulation. Finally, we introduce the Stem Cell–Omics Repository (SCOR), a resource that collates and displays quantitative information across multiple planes of measurement, including mRNA, protein, and post-translational modifications.