We present herein rPTMDetermine, an adaptive and fully automated
methodology for validation of the identification of rarely occurring
post-translational modifications (PTMs), using a semisupervised approach
with a linear discriminant analysis (LDA) algorithm. With this strategy,
verification is enhanced through similarity scoring of tandem mass
spectrometry (MS/MS) comparisons between modified peptides and their
unmodified analogues. We applied rPTMDetermine to (1) perform fully
automated validation steps for modified peptides identified from an in silico
database and (2) retrieve potential yet-to-be-identified modified peptides
from raw data that had been missed by conventional database searches.
In part (1), 99 of 125 3-nitrotyrosyl-containing
(nitrated) peptides obtained from a ProteinPilot search were validated
and localized. Twenty nitrated peptides were falsely assigned because
of incorrect monoisotopic peak assignments, leading to erroneous identification
of deamidation and nitration. Five additional nitrated peptides were,
however, validated after performing nonmonoisotopic peak correction.
In part (2), an additional 236 unique nitrated peptides were retrieved
and localized, containing 113 previously unreported nitration sites;
25 endogenous nitrated peptides with novel sites were selected and
verified by comparison with synthetic analogues. In summary, we identified
and confidently validated 296 unique nitrated peptides, collectively
representing the largest number of endogenously identified 3-nitrotyrosyl-containing
peptides from the cerebral cortex proteome of a Macaca fascicularis model
of stroke. Furthermore, we harnessed the rPTMDetermine strategy
to complement conventional database searching and enhance the confidence
of assigning rarely occurring PTMs, while recovering many missed peptides.
In a final demonstration, we successfully extended the application
of rPTMDetermine to peptides featuring tryptophan oxidation.
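
The similarity scoring between a modified peptide spectrum and its unmodified analogue can be illustrated with a minimal sketch. This is not the rPTMDetermine implementation; it is a generic shifted cosine similarity written under stated assumptions. The function name `shifted_cosine`, the 0.02 Da tolerance, the restriction to singly charged fragments, and the peak lists in the usage example are all hypothetical. The only fixed quantity is the monoisotopic mass added by nitration (H replaced by NO2), about 44.985078 Da. Fragments that carry the nitrated tyrosine are expected at the unmodified m/z plus this shift, while fragments that do not carry it are expected at the same m/z.

```python
import math

# Monoisotopic mass added by tyrosine nitration (H -> NO2), in Da.
NITRATION_SHIFT = 44.985078


def shifted_cosine(unmod, mod, shift=NITRATION_SHIFT, tol=0.02):
    """Cosine-style similarity between two peak lists of (m/z, intensity).

    Assumption (illustrative only): fragments are singly charged, so a
    fragment containing the modified residue appears at m/z + shift, and
    every other fragment appears at an unchanged m/z.
    """
    dot = 0.0
    used = set()  # indices of modified-spectrum peaks already paired
    for mz_u, int_u in unmod:
        best = None
        for j, (mz_m, int_m) in enumerate(mod):
            if j in used:
                continue
            same = abs(mz_m - mz_u) <= tol
            shifted = abs(mz_m - (mz_u + shift)) <= tol
            # Among candidate peaks, keep the most intense one.
            if (same or shifted) and (best is None or int_m > mod[best][1]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_u * mod[best][1]
    norm_u = math.sqrt(sum(i * i for _, i in unmod))
    norm_m = math.sqrt(sum(i * i for _, i in mod))
    if norm_u == 0.0 or norm_m == 0.0:
        return 0.0
    return dot / (norm_u * norm_m)


# Hypothetical peak lists: the first fragment lacks the nitrated residue,
# the other two carry it and are therefore shifted by NITRATION_SHIFT.
unmodified = [(175.119, 50.0), (338.182, 100.0), (451.266, 30.0)]
nitrated = [(175.119, 50.0), (383.167, 100.0), (496.251, 30.0)]
unrelated = [(200.000, 100.0)]

score_match = shifted_cosine(unmodified, nitrated)
score_unrelated = shifted_cosine(unmodified, unrelated)
```

In this toy case the nitrated spectrum reproduces the unmodified fragment pattern once the shift is accounted for, so `score_match` is close to 1, whereas `score_unrelated` is 0 because no peak pairs up. Because each peak is paired at most once, the score stays within [0, 1] for non-negative intensities. A score of this kind could serve as one feature among several in a discriminant model, but which features rPTMDetermine actually uses is not specified here.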