2023
DOI: 10.1101/2023.05.30.542978
Preprint

Deep learning from harmonized peptide libraries enables retention time prediction of diverse post translational modifications

Abstract: In proteomics experiments, peptide retention time (RT) is an orthogonal property to fragmentation when assessing detection confidence. Advances in deep learning enable accurate RT prediction for any peptide from sequence alone, including those yet to be experimentally observed. Here we present Chronologer, an open-source software tool for rapid and accurate peptide RT prediction. Using new approaches to harmonize and false-discovery correct across independently collected datasets, Chronologer is built on a mas…

Cited by 7 publications (13 citation statements)
References 54 publications
“…Chronologer [50] is a residual convolutional neural network developed for peptide RT prediction. To train their prediction tool, the authors undertook significant efforts in composing a large-scale, comprehensive peptide RT dataset by aggregating and harmonizing 11 public datasets.…”
Section: Chronologer
Citation type: mentioning
confidence: 99%
“…To gain insight into the current state of peptide property prediction datasets, we evaluated the performance of a transformer neural network encoder [56] in predicting the RT and CCS of a mixture of modified and unmodified peptides that were not previously seen. For RT prediction, we used a curated version of the Chronologer dataset [50], containing 2.1 million peptides and 12 PTMs. The dataset compiled by Meier et al [11], featuring ~714,000 peptides and 3 PTMs, was used for CCS prediction.…”
Section: Machine Learning Performance Is Driven By Data Availability
Citation type: mentioning
confidence: 99%
“…In this way, model performance can be optimized for specific datasets, even when only a limited amount of training data is available. Transfer learning is used by several deep learning tools [34, 44, 57] and may help to create models better suited for different scenarios, such as finetuning them for different fragmentation mechanisms, instrument platforms, and even lab‐specific data properties [58].…”
Section: Considerations For the Prediction Models
Citation type: mentioning
confidence: 99%
“…However, it is important to note that currently most PTMs found with open modification searching might not be supported by the underlying predictors. In this case, transfer learning can be used [57], as well as emerging algorithmic solutions that can predict peptide properties for unseen PTMs [34, 68]. Additionally, even though several approaches have been proposed to control the FDR in open modification searching [69], further research is necessary to investigate how the FDR behaves and to ensure that PSM rescoring does not introduce biases.…”
Section: Future Perspectives
Citation type: mentioning
confidence: 99%