Mass spectrometry combined with database searching has become the preferred method for identifying proteins in proteomics projects. Proteins are digested by one or several enzymes to obtain peptides, which are then analyzed by mass spectrometry. We introduce a new family of scoring schemes, named OLAV, aimed at identifying peptides in a database from their tandem mass spectra. OLAV scoring schemes are based on signal detection theory and exploit mass spectrometry information more extensively than existing schemes. We also introduce a new concept of structural matching that uses pattern detection methods to better separate true from false positives. We show the superiority of OLAV scoring schemes compared to MASCOT, a widely used identification program. We believe that this work introduces a new way of designing scoring schemes that are especially adapted to high-throughput projects such as the GeneProt large-scale human plasma project, where it is impractical to check all identifications manually.
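For orientation, the basic decision statistic of signal detection theory is a likelihood ratio compared against a threshold. The block below is a generic textbook form, not the specific OLAV formulation; the symbols E, H_1, H_0, and \tau are introduced here purely for illustration.

```latex
% Generic likelihood-ratio statistic from signal detection theory.
% E   : observed characteristics of a peptide/spectrum match (assumed notation)
% H_1 : hypothesis that the match is correct
% H_0 : hypothesis that the match is random
% tau : decision threshold, chosen to trade sensitivity against false positives
\[
  L(E) = \frac{P(E \mid H_1)}{P(E \mid H_0)},
  \qquad \text{accept the match if } L(E) > \tau .
\]
```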
We present a new approach capable of assigning charge states to peptides based on both their intact mass spectrum and their fragmentation mass spectrum. More specifically, our approach aims to fully exploit the available information to improve the rate of correct charge assignment. This is achieved by making extensive use of the information provided by the fragmentation spectrum. For low-resolution spectra, charge assignment based on the fragmentation mass spectrum is better than charge assignment based on the intact peptide signal alone. We introduce two methods that integrate the information contributing to successful peptide charge state assignment. We demonstrate the performance of our algorithms on large ion trap data sets. Applying these algorithms to large-scale proteomics projects can save significant computation time and reduce false positive identification rates.
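To illustrate how a fragmentation spectrum can carry charge information, here is a minimal sketch of one simple, widely used heuristic: fragments of a singly charged precursor cannot appear above the precursor m/z, so a noticeable share of intensity above it suggests a multiply charged precursor. This is not the method of the abstract above; the function names, the 2.0 m/z margin, and the 0.05 threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def fraction_above_precursor(
    peaks: List[Peak], precursor_mz: float, margin: float = 2.0
) -> float:
    """Return the share of total intensity found above the precursor m/z.

    `margin` (hypothetical default) skips the precursor's own isotope
    envelope so that it is not counted as fragment signal.
    """
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return 0.0
    above = sum(
        intensity for mz, intensity in peaks if mz > precursor_mz + margin
    )
    return above / total


def guess_charge_class(
    peaks: List[Peak], precursor_mz: float, threshold: float = 0.05
) -> str:
    """Classify the precursor as 'single' or 'multiple' charge.

    `threshold` is an illustrative cut-off, not a published value.
    """
    fraction = fraction_above_precursor(peaks, precursor_mz)
    return "multiple" if fraction > threshold else "single"


if __name__ == "__main__":
    singly = [(200.0, 10.0), (400.0, 20.0), (590.0, 5.0)]
    multiply = [(300.0, 10.0), (700.0, 10.0), (900.0, 20.0)]
    print(guess_charge_class(singly, precursor_mz=600.0))
    print(guess_charge_class(multiply, precursor_mz=500.0))
```

A heuristic like this only separates singly from multiply charged precursors; distinguishing 2+ from 3+ needs further evidence, which is the kind of information integration the abstract refers to.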
We propose a new probabilistic scoring framework for protein identification from peptide masses. We first introduce the framework itself and explain its requirements. We then describe a particular implementation and test it on a data set of more than 8000 MALDI-TOF spectra with known contents. In doing so, we also compare its performance to two widely used scoring schemes, thereby demonstrating the potential of the proposed approach.
Tandem mass spectrometry has become central in proteomics projects. In particular, it is of prime importance to design sensitive and selective score functions to reliably identify peptides in databases. Using a large collection of more than 140,000 peptide MS/MS spectra, we systematically study which characteristics of a match (peptide sequence/spectrum) are worth including in a score function. Besides classical match characteristics, we investigate the value of new characteristics such as amino acid dependence and consecutive fragment matches. We finally select a combination of promising characteristics and show that the corresponding score function achieves very low false positive rates while being very sensitive, thereby enabling highly automated peptide identification in large proteomics projects. We compare our results to widely used protein identification systems and show a significant reduction in false positives.
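One of the match characteristics named above, consecutive fragment matches, can be sketched as the longest run of adjacent theoretical fragments that each find an observed peak within a mass tolerance. This is a minimal illustration and not the score function of the abstract; the function names and the 0.5 m/z tolerance are assumptions, and the theoretical fragment m/z values are taken as given input in series order.

```python
from bisect import bisect_left
from typing import List, Sequence


def is_matched(target: float, observed_sorted: List[float], tol: float) -> bool:
    """True if some observed m/z lies within +/- tol of the target."""
    idx = bisect_left(observed_sorted, target - tol)
    return idx < len(observed_sorted) and observed_sorted[idx] <= target + tol


def longest_consecutive_run(
    theoretical: Sequence[float], observed: Sequence[float], tol: float = 0.5
) -> int:
    """Length of the longest run of consecutive matched theoretical fragments.

    `theoretical` must be ordered along the ion series (for example b1, b2, ...).
    `tol` is an illustrative tolerance, not a recommended setting.
    """
    observed_sorted = sorted(observed)
    best = current = 0
    for mz in theoretical:
        if is_matched(mz, observed_sorted, tol):
            current += 1
            best = max(best, current)
        else:
            current = 0  # a missed fragment breaks the run
    return best


if __name__ == "__main__":
    series = [100.0, 200.0, 300.0, 400.0, 500.0]
    spectrum = [100.2, 200.1, 399.8, 500.3, 650.0]
    print(longest_consecutive_run(series, spectrum))
```

The intuition is that random matches tend to be scattered along the series, whereas a correct peptide more often produces unbroken stretches of matched fragments, so run length can help separate the two.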