2024
DOI: 10.1101/2024.05.16.594602
Preprint

Accounting for digestion enzyme bias in Casanovo

Carlo Melendez,
Justin Sanders,
Melih Yilmaz
et al.

Abstract: A key parameter of any proteomics mass spectrometry experiment is the identity of the enzyme that is used to digest proteins in the sample into peptides. The Casanovo de novo sequencing model was trained using data that was generated with trypsin digestion; consequently, the model prefers to predict peptides that end with the amino acids "K" or "R." This bias is desirable when Casanovo is used to analyze data that was also generated using trypsin but can be problematic if the data was generated using some …
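
The tryptic bias described in the abstract can be made concrete by measuring how often a set of predicted peptides ends in "K" or "R". The following is a minimal sketch, not part of Casanovo or of the preprint: the function names and the example peptides are hypothetical, and it assumes peptides are plain one-letter amino acid strings without modification annotations.

```python
from collections import Counter

# Trypsin cleaves after lysine (K) and arginine (R), so tryptic
# peptides typically end with one of these two residues.
TRYPTIC_RESIDUES = {"K", "R"}


def c_terminal_counts(peptides):
    """Count the final amino acid of each non-empty peptide sequence."""
    return Counter(p[-1] for p in peptides if p)


def tryptic_fraction(peptides):
    """Return the fraction of non-empty peptides ending in K or R."""
    counts = c_terminal_counts(peptides)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[r] for r in TRYPTIC_RESIDUES) / total


# Hypothetical predictions, for illustration only.
example = ["PEPTIDEK", "LSSPATLNSR", "AVDLLE", "GGNFSGR"]
print(tryptic_fraction(example))  # 3 of 4 end in K or R -> 0.75
```

A fraction close to 1.0 on data digested with a non-tryptic enzyme would be one rough indicator of the bias the abstract describes.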

Cited by 3 publications
(1 citation statement)
References 14 publications
“…We anticipate a number of future directions for work to further improve Cascadia on DIA data. First, Cascadia can be specifically fine-tuned for specific applications of interest, for example by training on a dataset of MHC bound peptides to improve immunoproteomics analysis [28] or non-tryptic data to ameliorate the tryptic bias of the model [29]. Furthermore, there are many successful ideas proposed by de novo methods in the DDA setting which may also be beneficial in the DIA setting, including the addition of additional auxiliary training tasks [4,9], alternate decoding strategies [6,15], and post-processing algorithms to refine predictions [19,30].…”
Section: Discussion
Type: mentioning
confidence: 99%