2023
DOI: 10.1371/journal.pcbi.1010457

Multienzyme deep learning models improve peptide de novo sequencing by mass spectrometry proteomics

Abstract: Generating and analyzing overlapping peptides through multienzymatic digestion is an efficient procedure for de novo protein sequencing using bottom-up mass spectrometry (MS). Despite improved instrumentation and software, de novo MS data analysis remains challenging. In recent years, deep learning models have represented a performance breakthrough. Incorporating that technology into de novo protein sequencing workflows requires machine-learning models capable of handling highly diverse MS data. In this study, we ana…

Cited by 5 publications (5 citation statements)
References 71 publications
“…Peptide candidates were generated using three deep-learning-based de novo peptide tools: PointNovo (51), CasaNovo (52), and InstaNovo (53). For PointNovo, two in-house multienzyme-trained models (54) were utilized, whereas default models were employed for CasaNovo and InstaNovo. The study considered twelve fragment ions: a+1, a+2, b+1, b+2, y+1, y+2, a-H2O, b-H2O, y-H2O, a-NH3, b-NH3, and y-NH3, with candidate selection based on a 20 ppm tolerance at both MS1 and MS2 levels and the observation of a minimum of four fragment ions.…”
Section: Methods (citation type: mentioning)
confidence: 99%
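The selection rule quoted above (a 20 ppm tolerance at both MS1 and MS2 levels, plus at least four observed fragment ions) can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: the function names, the nearest-peak matching strategy, and all m/z values are hypothetical, and the theoretical fragment m/z list is assumed to be computed elsewhere.

```python
# Minimal sketch of ppm-based candidate filtering (hypothetical names and values).

def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 20.0) -> bool:
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def count_matched_ions(theoretical_mzs, observed_peaks, tol_ppm: float = 20.0) -> int:
    """Count theoretical fragment ions with at least one observed peak in tolerance."""
    return sum(
        any(within_tolerance(obs, theo, tol_ppm) for obs in observed_peaks)
        for theo in theoretical_mzs
    )


def accept_candidate(precursor_obs, precursor_theo, theoretical_mzs, observed_peaks,
                     tol_ppm: float = 20.0, min_ions: int = 4) -> bool:
    """Apply the MS1 tolerance first, then require enough matched MS2 fragment ions."""
    if not within_tolerance(precursor_obs, precursor_theo, tol_ppm):
        return False
    return count_matched_ions(theoretical_mzs, observed_peaks, tol_ppm) >= min_ions


# Hypothetical example: four of five fragments fall within 20 ppm.
theoretical = [100.0, 200.0, 300.0, 400.0, 500.0]
observed = [100.001, 200.002, 300.003, 400.02, 500.005]
accepted = accept_candidate(600.006, 600.0, theoretical, observed)
```

In this toy input the fourth peak is about 50 ppm off and is not counted, so the candidate passes only because the remaining four fragments and the precursor are each about 10 ppm from their theoretical values.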
“…The highest-ranked peptide candidates were those that not only showed maximal overlap but also robust MS evidence supporting the target protein sequences. The positional confidence score (54) for the assembled heterodimeric chains of the Fab domain is illustrated in Fig. 1F.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Moreover, this approach does not allow the model to generalize to settings where an MS/MS experiment makes use of novel combinations of digestion enzymes. Gueto-Tettay et al. experiment with both of these approaches, first training models on data generated using a single enzyme and then training on multi-enzyme datasets to increase their models' generalizability [13].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In the example above, “PEPTIDKE” becomes the preferred sequence under digestion by gluC instead of trypsin. Therefore, the accuracy of these data-driven de novo sequencing models suffers when they are applied to non-tryptic data [12, 13].…”
Section: Introduction (citation type: mentioning)
confidence: 99%