2007
DOI: 10.1038/nmeth1113

Semi-supervised learning for peptide identification from shotgun proteomics datasets

Abstract: Shotgun proteomics uses liquid chromatography-tandem mass spectrometry to identify proteins in complex biological samples. We describe an algorithm, called Percolator, for improving the rate of confident peptide identifications from a collection of tandem mass spectra. Percolator uses semi-supervised machine learning to discriminate between correct and decoy spectrum identifications, correctly assigning peptides to 17% more spectra from a tryptic Saccharomyces cerevisiae dataset, and up to 77% more spectra fro…

Citations: cited by 2,087 publications (1,907 citation statements)
References: 10 publications
“…After identification and quantification, we processed all phosphoproteomic and proteomics results according to very stringent peptide acceptance criteria. These criteria include: (i) a false discovery rate (FDR) of less than 1% at the peptide level utilizing the Percolator-based algorithm, 33 (ii) peptides needed to have a Mascot ion score of at least 20 and (iii) peptide length was restricted between 6 and 45 residues. Subsequently, we defined two classes of phosphopeptides and phosphosites: a phosphopeptide was designated Class I when the PhosphoRS site probability for each phosphorylated residue within the phosphopeptide was at least 75%.…”
Section: Discussion (mentioning)
confidence: 99%
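
The three acceptance criteria quoted above can be sketched as a simple filter. This is only an illustration, not the citing authors' pipeline: the `Peptide` record, its field names, and the use of a q-value to stand in for the 1% peptide-level FDR cut are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    # All fields are hypothetical names chosen for this sketch.
    sequence: str      # amino acid sequence
    q_value: float     # peptide-level q-value, e.g. as reported by Percolator
    ion_score: float   # Mascot ion score


def passes_criteria(p, max_fdr=0.01, min_score=20.0, min_len=6, max_len=45):
    """Apply the three quoted criteria: FDR < 1%, ion score >= 20, length 6 to 45."""
    return (
        p.q_value < max_fdr
        and p.ion_score >= min_score
        and min_len <= len(p.sequence) <= max_len
    )


def filter_peptides(peptides):
    """Keep only the peptides that satisfy every criterion."""
    return [p for p in peptides if passes_criteria(p)]
```

The thresholds are keyword arguments so a different study's cut-offs can be swapped in without changing the logic.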
“…We used the following parameters for database searching: 20 ppm precursor mass tolerance; fully digested with trypsin; up to three missed cleavages; fixed modification: carbamidomethylation of cysteine (+57.0214); variable modifications: oxidation of methionine (+15.9949). False discovery rates (FDRs) of peptide and protein identifications were evaluated and controlled to less than 1% by the target-decoy method [57] through linear discriminant analysis (LDA) [58]. Peptides fewer than seven amino acid residues in length were deleted.…”
Section: Methods (mentioning)
confidence: 99%
“…Oxidized methionine was allowed as a dynamic modification. False discovery rates (FDR) were determined by the Percolator algorithm 60 based on processing against a decoy database consisting of the shuffled target database. FDR was set at a target value of q 0.05 (5% FDR).…”
Section: Database Search and Spectral Annotation (mentioning)
confidence: 99%
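
Several of the statements above rely on target-decoy FDR estimation and q-value thresholds. A minimal sketch of that idea follows, assuming each peptide-spectrum match is a `(score, is_decoy)` pair and that higher scores are better. It uses the plain decoys-over-targets estimate and does not reproduce Percolator's semi-supervised rescoring; the function name and input layout are hypothetical.

```python
def target_decoy_q_values(psms):
    """Estimate q-values for target PSMs from a combined target/decoy list.

    psms: iterable of (score, is_decoy) pairs; higher score means a better match.
    Returns (score, q_value) pairs for the target PSMs, best score first.
    """
    ranked = sorted(psms, key=lambda item: item[0], reverse=True)

    # FDR estimate at each score threshold: decoys accepted / targets accepted.
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    q_values = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(score, q) for (score, is_decoy), q in zip(ranked, q_values) if not is_decoy]
```

Accepting every target whose q-value is at or below a chosen level (0.01 or 0.05 in the statements quoted here) then gives the set of identifications reported at that FDR.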