2023
DOI: 10.1101/2023.08.30.555055
Preprint

De novo peptide sequencing with InstaNovo: Accurate, database-free peptide identification for large scale proteomics experiments

Kevin Eloff, Konstantinos Kalogeropoulos, Oliver Morell, et al.

Abstract: Bottom-up mass spectrometry-based proteomics is challenged by the task of identifying the peptide that generates a tandem mass spectrum. Traditional methods that rely on known peptide sequence databases are limited and may not be applicable in certain contexts. De novo peptide sequencing, which assigns peptide sequences to the spectra without prior information, is valuable for various biological applications; yet, due to a lack of accuracy, it remains challenging to apply this approach in many situations. Here…

citations
Cited by 17 publications
(10 citation statements)
references
References 92 publications
(133 reference statements)
“…The ProteomeTools data consists of ground truth fragment ion intensities, RTs, and in some cases CCS values, that are expected to be largely free of interference due to the construction of the peptide pools. These data have been used to develop several ML tools, including the Prosit deep neural network for fragment ion intensity and RT prediction [4], and several de novo peptide sequencing tools [36], [37], [38], [39]. Additionally, ProteomeTools is often used as a benchmark dataset to evaluate prediction tools, such as the CCS prediction model from Meier et al [11].…”
Section: ProteomeTools (mentioning)
confidence: 99%
“…There is also a need to understand how Ab-seq experimental and computational protocols impact the coverage of antibody diversity 34,79,80 . We expect antibody repertoire studies will be facilitated in the future by the advances in both machine learning-based 43,45 and experiment-based 69,70 de novo peptide sequence analysis efforts.…”
Section: Conclusion and Recommendations (mentioning)
confidence: 99%
“…Since antibodies are so similar to each other yet so diverse, and the proportion of shared clones between individuals is very low 26 , it is more sensible to create a custom reference database from the same individual to ensure higher accuracy 32,34,35,37 . Novel methods have been developed where the peptide sequence can be determined de novo (without reference sequences), but these methods are not yet well-established for antibodies [38][39][40][41][42][43][44][45] . Therefore, the integration of Ab-seq with BCR sequencing technologies holds the promise of connecting the genomic and proteomic levels of the adaptive immune repertoire.…”
Section: Introduction (mentioning)
confidence: 99%
“…A novel approach involving cumulative fragment-ion evidence was applied to enhance de novo peptide sequencing and subsequent primary protein structure assembly. Peptide candidates were generated using three deep-learning-based de novo peptide tools: PointNovo (51), CasaNovo (52), and InstaNovo (53). For PointNovo, two in-house multienzyme-trained models (54) were utilized, whereas default models were employed for CasaNovo and InstaNovo.…”
Section: De Novo Sequencing Of the Nab Assembling Protein And Modelin... (mentioning)
confidence: 99%