Peptide Identification by Database Search of Mixture Tandem Mass Spectra

Wang, Jian; Bourne, Philip E.; Bandeira, Nuno

doi:10.1074/mcp.m111.010017

Cited by 30 publications

(32 citation statements)

References 36 publications

“…When interpreting an MS/MS spectrum as a mixture spectrum M, we construct two PRM spectra, M_H and M_L, each generated using the corresponding scoring models for high and low-abundance peptides present in a mixture spectrum. As shown in MixDB (8), different scoring models are needed for high and low-abundance peptides because they exhibit substantially different fragmentation statistics in mixture spectra. Intuitively, this is because the low-abundance peptides will generate less intense peaks in the mixture spectrum and, in general, it also has less number of detectable peaks above noise level.…”

Section: Methods

mentioning

confidence: 99%

“…Several recent analyses show that as many as 50% of the MS/MS spectra collected in typical proteomics experiments come from more than one peptide precursor (4, 5). The presence of multiple peptides in mixture spectra can decrease their identification rate to as low as one half of that for MS/MS spectra generated from only one peptide (6,7,8). In addition, there have been numerous developments in data independent acquisition (DIA) technologies where multiple peptide precursors are intentionally selected to cofragment in each MS/MS spectrum (9,10,11,12,13,14,15).…”

mentioning

confidence: 99%

“…low reproducibility (16)) and potentially increase the throughput of peptide identification 5-10 fold (4,17). However, despite the growing importance of mixture spectra in various contexts, there are still only a few computational tools that can analyze mixture spectra from more than one peptide (18,19,20,21,8,22). Our recent analysis indicated that current database search methods for mixture spectra still have relatively low sensitivity compared with their single-peptide counterpart and the main bottleneck is their limited ability to separate true matches from false positive matches (8).…”

mentioning

confidence: 99%

“…However, despite the growing importance of mixture spectra in various contexts, there are still only a few computational tools that can analyze mixture spectra from more than one peptide (18,19,20,21,8,22). Our recent analysis indicated that current database search methods for mixture spectra still have relatively low sensitivity compared with their single-peptide counterpart and the main bottleneck is their limited ability to separate true matches from false positive matches (8). Traditionally problem of peptide identification from MS/MS spectra involves two sub-problems: 1) define a Peptide-Spectrum-Match (PSM) scoring function that assigns each MS/MS spectrum to the peptide sequence that most likely generated the spectrum; and 2) given a set of top-scoring PSMs, select a subset that corresponds to statistical significance PSMs.…”

mentioning

confidence: 99%

MixGF: Spectral Probabilities for Mixture Spectra from more than One Peptide

Jian

Bourne

Bandeira

2014

Molecular & Cellular Proteomics

Self Cite

In large-scale proteomic experiments, multiple peptide precursors are often cofragmented simultaneously in the same mixture tandem mass (MS/MS) spectrum. These spectra tend to elude current computational tools because of the ubiquitous assumption that each spectrum is generated from only one peptide. Therefore, tools that consider multiple peptide matches to each MS/MS spectrum can potentially improve the relatively low spectrum identification rate often observed in proteomics experiments. More importantly, data independent acquisition protocols promoting the cofragmentation of multiple precursors are emerging as alternative methods that can greatly improve the throughput of peptide identifications but their success also depends on the availability of algorithms to identify multiple peptides from each MS/MS spectrum. Here we address a fundamental question in the identification of mixture MS/MS spectra: determining the statistical significance of multiple peptides matched to a given MS/MS spectrum. We propose the MixGF generating function model to rigorously compute the statistical significance of peptide identifications for mixture spectra and show that this approach improves the sensitivity of current mixture spectra database search tools by ≈30-390%. Analysis of multiple data sets with MixGF reveals that in complex biological samples the number of identified mixture spectra can be as high as 20% of all the identified spectra and the number of unique peptides identified only in mixture spectra can be up to 35.4% of those identified in single-peptide spectra.

1, 2, 3). In typical experiments, tens of thousands to millions of MS/MS spectra are generated and enable researchers to probe various aspects of the proteome on a large scale. Part of this success hinges on the availability of computational methods that can analyze the large amount of data generated from these experiments.

The classical question in computational proteomics asks: given an MS/MS spectrum, what is the peptide that generated the spectrum? However, it is increasingly being recognized that this assumption that each MS/MS spectrum comes from only one peptide is often not valid. Several recent analyses show that as many as 50% of the MS/MS spectra collected in typical proteomics experiments come from more than one peptide precursor (4, 5). The presence of multiple peptides in mixture spectra can decrease their identification rate to as low as one half of that for MS/MS spectra generated from only one peptide (6,7,8). In addition, there have been numerous developments in data independent acquisition (DIA) technologies where multiple peptide precursors are intentionally selected to cofragment in each MS/MS spectrum (9,10,11,12,13,14,15). These emerging technologies can address some of the enduring disadvantages of traditional data-dependent acquisition (DDA) methods (e.g. low reproducibility (16)) and potentially increase the throughput of peptide identification 5-10 fold (4, 17). However, despite the growing importance of mixture spectra in var...

“…For cross-linked peptides one evaluates how well a pair of peptides matches to an MS/MS spectrum. In our previous method, MixDB (42), we introduced a probabilistic model to score how well a pair of peptides matches to a mixture MS/MS spectrum from co-eluting peptides. The statistical framework used here extends that of MixDB by further capturing the specific fragmentation pattern of linked peptides.…”

Section: Methods

mentioning

confidence: 99%

Combinatorial Approach for Large-scale Identification of Linked Peptides from Tandem Mass Spectrometry Spectra

Jian

Anania

Knott

et al.

2014

Molecular & Cellular Proteomics

Self Cite

The combination of chemical cross-linking and mass spectrometry has recently been shown to constitute a powerful tool for studying protein-protein interactions and elucidating the structure of large protein complexes. However, computational methods for interpreting the complex MS/MS spectra from linked peptides are still in their infancy, making the high-throughput application of this approach largely impractical. Because of the lack of large annotated datasets, most current approaches do not capture the specific fragmentation patterns of linked peptides and therefore are not optimal for the identification of cross-linked peptides. Here we propose a generic approach to address this problem and demonstrate it using disulfide-bridged peptide libraries to (i) efficiently generate large mass spectral reference data for linked peptides at a low cost and (ii) automatically train an algorithm that can efficiently and accurately identify linked peptides from MS/MS spectra. We show that using this approach we were able to identify thousands of MS/MS spectra from disulfide-bridged peptides through comparison with proteome-scale sequence databases and significantly improve the sensitivity of cross-linked peptide identification. This allowed us to identify 60% more direct pairwise interactions between the protein subunits in the 20S proteasome complex than existing tools on cross-linking studies of the proteasome complexes. The basic framework of this approach and the MS/MS reference dataset generated should be valuable resources for the future development of new tools for the identification of linked peptides.

Enhancement of mass spectrometry performance for proteomic analyses using high‐field asymmetric waveform ion mobility spectrometry (FAIMS)

2015

Remarkable advances in mass spectrometry sensitivity and resolution have been accomplished over the past two decades to enhance the depth and coverage of proteome analyses. As these technological developments expanded the detection capability of mass spectrometers, they also revealed an increasing complexity of low abundance peptides, solvent clusters and sample contaminants that can confound protein identification. Separation techniques that are complementary and can be used in combination with liquid chromatography are often sought to improve mass spectrometry sensitivity for proteomics applications. In this context, high-field asymmetric waveform ion mobility spectrometry (FAIMS), a form of ion mobility that exploits ion separation at low and high electric fields, has shown significant advantages by focusing and separating multiply charged peptide ions from singly charged interferences. This paper examines the analytical benefits of FAIMS in proteomics to separate co-eluting peptide isomers and to enhance peptide detection and quantitative measurements of protein digests via native peptides (label-free) or isotopically labeled peptides from metabolic labeling or chemical tagging experiments.
