2022
DOI: 10.1021/acs.analchem.2c03491

CCS Predictor 2.0: An Open-Source Jupyter Notebook Tool for Filtering Out False Positives in Metabolomics

Abstract: Metabolite annotation continues to be the widely accepted bottleneck in nontargeted metabolomics workflows. Annotation of metabolites typically relies on a combination of high-resolution mass spectrometry (MS) with parent and tandem measurements, isotope cluster evaluations, and Kendrick mass defect (KMD) analysis. Chromatographic retention time matching with standards is often used at the later stages of the process, which can also be followed by metabolite isolation and structure confirmation utilizing nucle…

Cited by 29 publications (42 citation statements)
References 34 publications

“…To predict CCS values, two data sets were created using the Unified CCS Compendium, one containing all entries for [M + H]⁺ adduct species and another set for all [M – H]⁻ entries (n = 644 and 582, respectively). These data sets were then randomly split 75% into training sets and 25% into test sets, which were used to construct a support vector regression (SVR)-based ML model using CCSP 2.0, as illustrated in Figure . Using the optimized SVR models, CCS values for any given candidate structure could be predicted for [M + H]⁺ and [M – H]⁻ species from their neutral InChI code.…”
Section: Methods (mentioning)
confidence: 99%
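
The random 75%/25% split described in this citation statement can be sketched as follows. This is a minimal illustration, not the citing authors' code or the CCSP 2.0 implementation: the function name `split_train_test`, the seed, the rounding rule, and the placeholder entry IDs are all assumptions. Only the set sizes (644 and 582) come from the quoted text, and the SVR fitting step is not shown.

```python
import random


def split_train_test(entries, train_fraction=0.75, seed=0):
    """Randomly split entries into (train, test) lists.

    Hypothetical helper: the seed and the round-down rule for the
    training set size are assumptions, not taken from the source.
    """
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the compendium entries (not real data).
pos_entries = [f"pos_{i}" for i in range(644)]  # [M + H]+ set, n = 644
neg_entries = [f"neg_{i}" for i in range(582)]  # [M - H]- set, n = 582

pos_train, pos_test = split_train_test(pos_entries)
neg_train, neg_test = split_train_test(neg_entries)

print(len(pos_train), len(pos_test))
print(len(neg_train), len(neg_test))
```

Each adduct type is split separately because the quoted statement describes one model per adduct.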
“…A CCS index filter defined as mean error +/− 2 SD, i.e., maximum 16.16 Å², could be used as the threshold for excluding false positives. This match tolerance, reflecting the deviation of analytes or family of analytes or type of adducts, is relatively large compared to other work demonstrating that median relative errors as low as 3 to 5% are reachable using other models [22, 24, 25, 26, 27]. However, excluding false positive identifications with a CCS match higher than the defined threshold remains of great importance when considering the number of possible matches when using m/z match, isotope similarity, and fragmentation score only.…”
Section: Discussion (mentioning)
confidence: 78%
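
A filter of the form "mean error +/− 2 SD" could be sketched as below. This is an illustration under stated assumptions, not the citing authors' workflow: the function names are hypothetical, the error is taken as predicted minus measured CCS, the sample standard deviation is used, and the numbers are invented for the example (they are not the data behind the 16.16 Å² figure).

```python
from statistics import mean, stdev


def ccs_error_window(measured, predicted, n_sd=2.0):
    """Return (low, high) bounds for the CCS error: mean error +/- n_sd * SD.

    Assumptions: error = predicted - measured, sample SD (n - 1 divisor).
    """
    errors = [p - m for m, p in zip(measured, predicted)]
    mu = mean(errors)
    sd = stdev(errors)
    return mu - n_sd * sd, mu + n_sd * sd


def passes_ccs_filter(measured_ccs, predicted_ccs, window):
    """Keep a candidate only if its CCS error falls inside the window."""
    low, high = window
    return low <= predicted_ccs - measured_ccs <= high


# Invented reference values in square angstroms, for illustration only.
measured = [150.0, 160.0, 170.0, 180.0, 190.0]
predicted = [152.0, 158.0, 173.0, 179.0, 194.0]
window = ccs_error_window(measured, predicted)

print(passes_ccs_filter(200.0, 203.0, window))  # small error: candidate kept
print(passes_ccs_filter(200.0, 215.0, window))  # large error: candidate excluded
```

Candidates that fail the window would be treated as likely false positives, on top of the m/z, isotope similarity, and fragmentation criteria mentioned in the statement.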
“…The resulting performances could have been further validated by performing a side-by-side comparison with other existing machine-learning tools. Such a comparison has already been described elsewhere [22, 24, 25, 26, 27]. Instead of that, the chosen strategy consisted of emphasizing the usefulness of our workflow with concrete application on biological data.…”
Section: Discussion (mentioning)
confidence: 99%
“…Such comparisons allow the researcher to gain better insights into the conformational landscape adopted. These methods have been well-reviewed and broadly fall into three categories. In the first category are those that fully evaluate the trajectory of the ion as it interacts with the buffer gas so call trajectories methods (TM) including (MOBCAL-TM and IMoS). The second category includes those that consider the projected area of the candidate structure and use empirical data to determine a CCS (PA, PSA, IMPACT). The last category considers the recently emergent machine learning approaches. The first two approaches rely on a reasonable starting structure, and commonly with proteins, molecular dynamics methods, both atomistic and coarse-grained that can be used to provide candidate gas-phase geometries. Such molecular dynamics (MD) evaluation can be computationally very expensive, although refinements to this have been made that integrate CCS values into the conformational searching for suitable candidate geometries.…”
Section: Developments In Ion Mobility Mass Spectrometry (Im-ms) Instr... (mentioning)
confidence: 99%