2018
DOI: 10.1101/428334
Preprint

Uncovering hidden members and functions of the soil microbiome using de novo metaproteomics

Abstract: The fundamental task in proteomic mass spectrometry is identifying peptides from their observed spectra. Where protein sequences are known, standard algorithms utilize these to narrow the list of peptide candidates. If protein sequences are unknown, a distinct class of algorithms must interpret spectra de novo. Despite decades of effort on algorithmic constructs and machine learning methods, de novo software tools remain inaccurate when used on environmentally diverse samples. Here we train a deep neural netwo…

Cited by 7 publications
(5 citation statements)
References 51 publications
“…from the individual repositories. For potential future work, it is promising that more and more deep learning-specific technical data sets are made available Gessulat et al (2019); Lee et al (2018).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…First, training using larger datasets may further improve the models. [91] The second aspect is training species-specific models using transfer learning by leveraging large datasets from other species. Since the protein sequences from different species may have different patterns, the deep learning models trained using MS/MS data from one species may not generalize well to another species, but the patterns and rules learned from other species with large datasets could benefit the training for a specific species with a relatively small dataset.…”
Section: Deep Learning For De Novo Peptide Sequencing; citation type: mentioning
confidence: 99%
“…Some recent publications have applied various machine learning techniques to the de novo sequencing problem, and can be divided into two groups. Approaches in the first group use empirical machine learning as the primary method of peptide identification, such as DeepNovo [26], Kaiko [27], and SMSNet [28]. These tools make use of large sets of MS/MS spectra that had previously been confidently identified by other methods, such as database search with stringent quality filters.…”
Section: Machine Learning Approaches; citation type: mentioning
confidence: 99%
“…It was trained on 1.7 million spectra from multiple species. Kaiko [27] built on the same deep learning framework but had a much larger training set (5 million), and demonstrated that DeepNovo as published may suffer from overfitting. Both of these papers reported improved performance relative to PEAKS and Novor, but interestingly the Kaiko paper reported that PEAKS and Novor outperformed DeepNovo but not Kaiko.…”
Section: Machine Learning Approaches; citation type: mentioning
confidence: 99%