We carried out a test sample study to identify errors that lead to irreproducibility in LC-MS-based proteomics, including incomplete peptide sampling. We distributed a test sample consisting of an equimolar mix of 20 highly purified recombinant human proteins to 27 laboratories for identification. Each protein contained one or more unique tryptic peptides of 1250 Da, allowing us also to test ion selection and sampling in the mass spectrometer. Initially, only 7 of the 27 labs reported all 20 proteins correctly, and only 1 lab reported all of the 1250 Da tryptic peptides. Nevertheless, a subsequent centralized analysis of the raw data revealed that all 20 proteins, and most of the 1250 Da peptides, had in fact been detected by all 27 labs. The centralized analysis allowed us to determine the sources of the problems encountered in the study, which included missed identifications (false negatives), environmental contamination, database matching, and curation of protein identifications. Improved search engines and databases are likely to increase the fidelity of mass spectrometry-based proteomics.
Proteomic technologies, such as yeast two-hybrid, mass spectrometry (MS), protein/peptide arrays and fluorescence microscopy, yield multi-dimensional data sets that are often quite large and either not published or published as supplementary information that is not easily searchable. Without a system in place for standardizing and sharing data, it is not fruitful for the biomedical community to contribute these types of data to centralized repositories. Even more difficult is the annotation and display of pertinent information in the context of the corresponding proteins. Wikipedia, an online encyclopedia that anyone can edit, has already proven quite successful1 and can serve as a model for sharing biological data. However, the need for experimental evidence, data standardization and ownership of data creates scientific obstacles.

Here, we describe Human Proteinpedia (http://www.humanproteinpedia.org/), a portal that overcomes many of these obstacles to provide an integrated view of the human proteome. Human Proteinpedia also allows users to contribute and edit proteomic data, with two significant differences from Wikipedia: first, the contributor is expected to provide experimental evidence for the data annotated; and second, only the original contributor can edit their data. Human Proteinpedia's annotation system gives investigators multiple options for contributing data, including web forms and annotation servers. Although registration is required to contribute data, anyone can freely access the data in the repository. The web forms simplify submission through pull-down menus for certain data fields and pop-up menus for standardized vocabulary terms. Distributed annotation servers, which use modified protein DAS (distributed annotation system) protocols developed by us (DAS protocols were originally developed for sharing mRNA and DNA data), permit contributing laboratories to maintain protein annotations locally. All protein annotations are visualized in the context of the corresponding proteins in the Human Protein Reference Database (HPRD)3. Figure 1 shows tissue expression data for alpha-2-HS glycoprotein derived from three different types of experiments.

Our effort differs significantly from existing repositories, such as PeptideAtlas and PRIDE5, in several respects. First, most proteomic repositories are restricted to one or two experimental platforms, whereas Human Proteinpedia can accommodate data from diverse platforms, including yeast two-hybrid screens, MS, peptide/protein arrays, immunohistochemistry, western blots, coimmunoprecipitation and fluorescence microscopy-type experiments. Second, Human Proteinpedia allows contributing laboratories to annotate data pertaining to six features of proteins (posttranslational modifications, tissue expression, cell line expression, subcellular localization, enzyme substrates and protein-protein interactions); no existing repository currently permits annotation of all of these features. Third, all data submitted to Human Proteinpedia...
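To illustrate the distributed-annotation route described above, the sketch below queries a DAS-style features endpoint for one protein and lists its annotated features. This is a minimal sketch under stated assumptions: the server URL, data-source name and accession are hypothetical placeholders, not the actual Human Proteinpedia or HPRD endpoints.

# Minimal sketch of retrieving protein annotations from a DAS-style
# features endpoint. The server URL, data-source name and accession
# below are hypothetical placeholders, not real Human Proteinpedia URLs.
import urllib.request
import xml.etree.ElementTree as ET

DAS_SERVER = "http://example.org/das"      # hypothetical annotation server
DATA_SOURCE = "protein_annotations"        # hypothetical DAS data source

def fetch_features(protein_accession):
    """Query the DAS 'features' command for one protein and return
    (type, method, label) tuples for each annotated feature."""
    url = f"{DAS_SERVER}/{DATA_SOURCE}/features?segment={protein_accession}"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    features = []
    for feat in tree.iter("FEATURE"):
        ftype = feat.findtext("TYPE", default="")
        method = feat.findtext("METHOD", default="")
        features.append((ftype, method, feat.get("label", "")))
    return features

if __name__ == "__main__":
    # e.g. tissue expression or PTM annotations for one (hypothetical) accession
    for ftype, method, label in fetch_features("P02765"):
        print(ftype, method, label)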
Unidentified tandem mass spectra typically represent 50-90% of the spectra acquired in proteomics studies. This manuscript describes a novel algorithm, "Bonanza", for clustering spectra without knowledge of peptide or protein identifications. Clusters that contain identified spectra are then used to infer related, likely valid identifications for the unidentified spectra they contain. Significantly more spectra can be identified with this approach, including spectra with unexpected modifications or amino-acid substitutions.
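The cluster-then-propagate idea can be sketched generically. The code below is not the published Bonanza scoring function; it uses a plain binned dot-product (cosine) similarity and a greedy clustering pass purely to illustrate how identifications from identified spectra can be transferred to unidentified members of the same cluster.

# Generic sketch of identification transfer by spectral clustering
# (illustrative only; not the Bonanza scoring function).
import numpy as np

def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """peaks: list of (m/z, intensity) pairs. Returns a unit-normalized
    fixed-length intensity vector."""
    vec = np.zeros(int(max_mz / bin_width))
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if idx < vec.size:
            vec[idx] += intensity
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def cluster_spectra(spectra, threshold=0.7):
    """Greedy single-pass clustering on cosine similarity between binned spectra.
    Each cluster is represented by its first member."""
    vectors = [bin_spectrum(s["peaks"]) for s in spectra]
    clusters = []  # each cluster is a list of spectrum indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if np.dot(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def propagate_identifications(spectra, clusters):
    """If a cluster contains exactly one distinct peptide identification,
    assign it to the cluster's unidentified members."""
    for cluster in clusters:
        peptides = {spectra[i]["peptide"] for i in cluster if spectra[i].get("peptide")}
        if len(peptides) == 1:
            peptide = peptides.pop()
            for i in cluster:
                spectra[i].setdefault("peptide", peptide)
    return spectra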
A current focus of proteomics research is the establishment of acceptable confidence measures for the assignment of protein identifications in an unknown sample. Development of new algorithmic approaches would benefit greatly from a standard reference set of spectra from known proteins for testing and training. Here we describe an openly available library of mass spectra generated on an ABI 4700 MALDI TOF/TOF from 246 known, individually purified and trypsin-digested protein samples. The initial full release of the Aurum Dataset includes gel images, peak lists, spectra, search result files, decoy database analysis files, a FASTA file of the protein sequences, manual curation, and summary pages describing protein coverage and the peptides matched by MS/MS, with decoy database analysis performed using Mascot, Sequest, and X!Tandem. The data are publicly available for use at
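For readers unfamiliar with the decoy database analysis files mentioned above, the following is a minimal sketch of the standard target-decoy false-discovery-rate estimate such files support. The field names ("score", "is_decoy") and the scoring convention are assumptions for illustration, not the Aurum file format.

# Sketch of a standard target-decoy FDR estimate (illustrative assumptions:
# each peptide-spectrum match is a dict with a search-engine 'score' and an
# 'is_decoy' flag, and higher scores are better).
def fdr_at_threshold(psms, score_threshold):
    """Estimate FDR as decoy hits / target hits among matches at or above
    the score threshold."""
    targets = sum(1 for p in psms if p["score"] >= score_threshold and not p["is_decoy"])
    decoys = sum(1 for p in psms if p["score"] >= score_threshold and p["is_decoy"])
    return (decoys / targets) if targets else 0.0

def score_cutoff_for_fdr(psms, target_fdr=0.01):
    """Scan candidate thresholds from high to low and return the most
    permissive (lowest) score cutoff whose estimated FDR is <= target_fdr."""
    best = None
    for threshold in sorted({p["score"] for p in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) <= target_fdr:
            best = threshold
    return best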
Multiple reaction monitoring (MRM) is a highly sensitive method of targeted mass spectrometry (MS) that can be used to selectively detect and quantify peptides based on the screening of specified precursor peptide-to-fragment ion transitions. MRM-MS sensitivity depends critically on the tuning of instrument parameters, such as collision energy and cone voltage, for the generation of maximal product ion signal. Although generalized equations and values exist for such instrument parameters, there is no clear indication that optimal signal can be reliably produced for all types of MRM transitions using such an algorithmic approach. To address this issue, we have devised a workflow, functional on both Waters Quattro Premier and ABI 4000 QTRAP triple quadrupole instruments, that allows rapid determination of the optimal value of any programmable instrument parameter for each MRM transition. Here, we demonstrate the strategy for the optimization of collision energy and cone voltage, but the method could be applied to other instrument parameters, such as declustering potential, as well. The workflow makes use of incremental adjustment of the precursor and product m/z values at the hundredth decimal place to create a series of MRM targets at different collision energies that can be cycled through in rapid succession within a single run, avoiding any run-to-run variability in execution or comparison. Results are easily visualized and quantified using the MRM software package Mr. M to determine the optimal instrument parameters for each transition.
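The transition-list trick at the heart of this workflow can be sketched as follows: clone a single MRM transition with tiny (0.01 m/z) offsets so the instrument treats each copy as a distinct target, and pair each copy with a different collision energy so all values are tested within one run. The column names, collision-energy range and example m/z values below are illustrative assumptions, not a vendor method-file format.

# Sketch of generating a collision-energy scan as a set of near-duplicate
# MRM transitions offset at the hundredth decimal place (illustrative only).
import csv

def build_ce_scan(precursor_mz, product_mz, ce_values, mz_step=0.01):
    """Return one row per collision-energy value; the i-th row offsets both
    the precursor and product m/z by i * mz_step so the instrument schedules
    each copy as a separate transition."""
    rows = []
    for i, ce in enumerate(ce_values):
        rows.append({
            "precursor_mz": round(precursor_mz + i * mz_step, 2),
            "product_mz": round(product_mz + i * mz_step, 2),
            "collision_energy": ce,
        })
    return rows

if __name__ == "__main__":
    # Example: test collision energies 10-40 V in 2 V steps for one transition
    # (m/z values are made up for illustration).
    rows = build_ce_scan(785.84, 1045.56, ce_values=range(10, 42, 2))
    with open("ce_optimization_transitions.csv", "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)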