2023
DOI: 10.1021/acs.jproteome.2c00711

Toward an Integrated Machine Learning Model of a Proteomics Experiment

Abstract: In recent years machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goals to evaluate and explore machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able…

Cited by 37 publications (20 citation statements)
References 141 publications

“…Our experiments showcase the stability and effectiveness of PepT3 in improving the current paradigm for deep MS/MS spectrum prediction. As benchmark data sets are being established and more DSPMs are emerging, PepT3 can serve as a critical link in connecting DSPMs from standardized training to laboratory evaluation. DSPM with PepT3 holds great potential for proteomic research, particularly for highly complex and quantity-limited samples.…”
Section: Discussion (mentioning)
confidence: 99%
“…Therefore, supervised learning-based training paradigm used by these models has a limitation. 17 Variations in the characteristics of mass spectra from laboratory to laboratory and experiment to experiment 18 can lead to a generalization problem for the out-of-distribution experimental spectra. Empirically, this could weaken DSPM's effectiveness in predicting spectra and result in a degradation of performance.…”
Section: Introduction (mentioning)
confidence: 99%
“…The filtering was performed using the filter API using the -q option with MongoDB or string filters available in the GitHub repository. The model training and testing were performed using the network API with "-t prosit -e 100 -sos -s n" to train a Prosit model for 100 We utilized a Bayesian approximation of the model uncertainty, by performing model inference with dropout enabled [25,26]. The real retention time values are then plotted against the mean predicted values, with the color of the data point corresponding to the normalized variances of the predicted values.…”
Section: Methods (mentioning)
confidence: 99%
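
The dropout-based uncertainty estimate described in the statement above (often called Monte Carlo dropout) can be sketched as follows. This is a minimal pure-Python illustration, not the cited authors' code: the toy one-hidden-layer network, the names `make_model`, `forward`, and `mc_dropout_predict`, the dropout rate, the number of passes, and the min-max normalization of the variances are all assumptions chosen for the example.

```python
import random
import statistics


def make_model(n_hidden, seed=0):
    """Build a toy one-hidden-layer regressor with random weights (hypothetical stand-in for a trained model)."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, b1, w2


def forward(model, x, drop_rate, rng):
    """One forward pass; dropout stays active whenever drop_rate > 0."""
    w1, b1, w2 = model
    keep = 1.0 - drop_rate
    out = 0.0
    for a, b, c in zip(w1, b1, w2):
        h = max(0.0, a * x + b)  # ReLU hidden unit
        if drop_rate > 0.0:
            # Inverted dropout: zero the unit with probability drop_rate,
            # otherwise rescale so the expected activation is unchanged.
            h = h / keep if rng.random() < keep else 0.0
        out += c * h
    return out


def mc_dropout_predict(model, xs, n_passes=50, drop_rate=0.1, seed=1):
    """Return per-input mean predictions and min-max normalized variances."""
    rng = random.Random(seed)
    means, variances = [], []
    for x in xs:
        # Repeat inference with a fresh random dropout mask on every pass.
        samples = [forward(model, x, drop_rate, rng) for _ in range(n_passes)]
        means.append(statistics.fmean(samples))
        variances.append(statistics.pvariance(samples))
    v_min, v_max = min(variances), max(variances)
    span = v_max - v_min
    # Assumed normalization: rescale variances to [0, 1] across the inputs.
    normalized = [(v - v_min) / span if span > 0 else 0.0 for v in variances]
    return means, normalized


if __name__ == "__main__":
    model = make_model(n_hidden=8)
    means, norm_var = mc_dropout_predict(model, [0.0, 0.5, 1.0])
    for m, v in zip(means, norm_var):
        print(f"mean prediction {m:.3f}, normalized variance {v:.3f}")
```

In a plot like the one the statement describes, the mean predictions would go on one axis against the measured retention times, with the normalized variance mapped to point color.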
“…Initially popularized in fields like medical imaging, speech recognition, computer vision, and natural language processing, these algorithms have marked milestones such as predicting folding of proteins with remarkable accuracy, making them particularly effective when applied to large and complex data 25. Given the data-intensive nature of modern biotechnological research, proteomics is increasingly becoming a fertile ground for the application of deep learning technologies [26][27][28].…”
Section: Introduction (mentioning)
confidence: 99%