Metabolite annotation remains a widely recognized bottleneck in nontargeted metabolomics workflows. Annotation typically relies on a combination of high-resolution mass spectrometry (MS) with parent and tandem measurements, isotope cluster evaluations, and Kendrick mass defect (KMD) analysis. Chromatographic retention time matching with standards is often used at later stages of the process, and can be followed by metabolite isolation and structure confirmation using nuclear magnetic resonance (NMR) spectroscopy. The measurement of gas-phase collision cross-section (CCS) values by ion mobility (IM) spectrometry adds an important dimension to this workflow by generating an additional molecular parameter that can be used to filter out unlikely structures. The millisecond timescale of IM spectrometry permits rapid measurement of CCS values and easy pairing with existing MS workflows. Here, we report on a highly accurate machine learning algorithm (CCSP 2.0), distributed as an open-source Jupyter Notebook, that predicts CCS values using linear support vector regression models. The tool allows the training set to be customized to the needs of the user, enabling the production of models for new adducts or previously unexplored molecular classes. CCSP produces predictions with accuracy equal to or greater than existing machine learning approaches such as CCSbase, DeepCCS, and AllCCS, while being better aligned with FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Another unique aspect of CCSP 2.0 is its inclusion of a large library of 1613 molecular descriptors via the Mordred Python package, which further encodes the fine aspects of isomeric molecular structures. CCS prediction accuracy was tested using CCS values in the McLean CCS Compendium, giving median relative errors of 1.25, 1.73, and 1.87% for the 170 [M − H]−, 155 [M + H]+, and 138 [M + Na]+ adducts tested, respectively. For superclass-matched data sets, CCS predictions via CCSP allowed filtering of 36.1% of incorrect structures while retaining 100% of the correct annotations, using a ΔCCS threshold of 2.8% and a mass error of 10 ppm.
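The filtering step described at the end of this abstract can be illustrated with a minimal sketch. It is not the CCSP 2.0 implementation: the `Candidate` class, the function names, and the example m/z and CCS numbers are all hypothetical, and ΔCCS is assumed here to be the predicted-minus-measured difference relative to the measured value. Only the two thresholds (2.8% and 10 ppm) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate structure for one ion feature."""
    name: str
    theoretical_mz: float  # theoretical m/z of the candidate adduct
    predicted_ccs: float   # predicted CCS, in square angstroms


def ppm_error(measured_mz, theoretical_mz):
    """Mass error in parts per million, relative to the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def ccs_error_percent(measured_ccs, predicted_ccs):
    """Assumed definition of delta-CCS: percent difference vs. the measured value."""
    return (predicted_ccs - measured_ccs) / measured_ccs * 100.0


def filter_candidates(candidates, measured_mz, measured_ccs,
                      max_ppm=10.0, max_ccs_pct=2.8):
    """Keep candidates that fall within both the mass and the CCS tolerance."""
    kept = []
    for cand in candidates:
        if abs(ppm_error(measured_mz, cand.theoretical_mz)) > max_ppm:
            continue  # rejected on mass error
        if abs(ccs_error_percent(measured_ccs, cand.predicted_ccs)) > max_ccs_pct:
            continue  # rejected on CCS mismatch
        kept.append(cand)
    return kept


# Made-up feature and candidates, for illustration only.
measured_mz = 181.0707
measured_ccs = 140.0
candidates = [
    Candidate("candidate_A", 181.0707, 141.0),  # within both tolerances
    Candidate("candidate_B", 181.0710, 146.0),  # mass matches, CCS is about 4.3% off
    Candidate("candidate_C", 181.0750, 140.5),  # CCS matches, mass is about 24 ppm off
]
kept = filter_candidates(candidates, measured_mz, measured_ccs)
print([c.name for c in kept])
```

Tightening `max_ccs_pct` removes more incorrect structures but raises the risk of discarding the correct one, which is the trade-off behind the reported 36.1% filtered at 100% retention.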
Ion mobility (IM) spectrometry provides data that are semiorthogonal to mass spectrometry (MS), showing promise for identifying unknown metabolites in complex non-targeted metabolomics data sets. While the current literature has showcased IM-MS for identifying unknowns under near-ideal circumstances, less work has been done to evaluate the performance of this approach in metabolomics studies involving highly complex samples with difficult matrices. Here, we present a workflow that combines de novo molecular formula annotation and MS/MS structure elucidation using SIRIUS 4 with experimental IM collision cross-section (CCS) measurements and machine learning CCS predictions to identify differential unknown metabolites in mutant strains of Caenorhabditis elegans. For many of those ion features, this workflow enabled the successful filtering of candidate structures generated by in silico MS/MS predictions, though in some cases annotations were hampered by significant hurdles in instrument performance and data analysis. We were able to collect both MS/MS and CCS data for 37% of differential features, but fewer than half of these features benefited from a reduction in the number of possible candidate structures through CCS filtering. The causes were poor matching of the machine learning training sets, limited accuracy of experimental and predicted CCS values, and a lack of candidate structures resulting from the MS/MS data. With a CCS error cutoff of ±3%, on average 28% of candidate structures could be successfully filtered. Herein, we identify and describe the bottlenecks and limitations associated with the identification of unknowns in non-targeted metabolomics using IM-MS, highlighting the areas that require further improvement.
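To make the "on average, 28% of candidate structures" statistic concrete, the following minimal sketch computes a per-feature filtered fraction at a ±3% cutoff and then averages it over features. The function names and the numbers are hypothetical, and the choice to skip features with no candidates when averaging is an assumption; the abstract does not say how such features were handled.

```python
from statistics import mean


def filtered_fraction(measured_ccs, predicted_ccs_values, cutoff_pct=3.0):
    """Fraction of a feature's candidates whose predicted CCS lies outside the cutoff.

    Returns None when the feature has no candidate structures at all.
    """
    if not predicted_ccs_values:
        return None
    removed = sum(
        1 for predicted in predicted_ccs_values
        if abs(predicted - measured_ccs) / measured_ccs * 100.0 > cutoff_pct
    )
    return removed / len(predicted_ccs_values)


def mean_filtered_fraction(features, cutoff_pct=3.0):
    """Average the per-feature fractions, skipping features without candidates."""
    fractions = []
    for measured_ccs, predicted_ccs_values in features:
        fraction = filtered_fraction(measured_ccs, predicted_ccs_values, cutoff_pct)
        if fraction is not None:
            fractions.append(fraction)
    return mean(fractions) if fractions else None


# Made-up features: (measured CCS, predicted CCS of each candidate structure).
features = [
    (150.0, [151.0, 156.0, 144.0, 149.0]),  # two of four candidates are 4% off
    (200.0, [201.0, 203.0]),                # both candidates are within 3%
    (180.0, []),                            # MS/MS yielded no candidates
]
print(mean_filtered_fraction(features))
```

The second and third features illustrate two of the limitations named in the abstract: a cutoff too loose to separate similar candidates, and a feature for which no candidate structures exist to filter.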
The interpretation of ion mobility coupled to mass spectrometry (IM-MS) data to predict unknown structures is challenging and depends on accurate theoretical estimates of the molecular ion collision cross section (CCS) against a buffer gas in a low- or atmospheric-pressure drift chamber. The sensitivity and reliability of computationally predicted CCS values depend on accurately modeling the molecular state over its accessible conformations. In this work, we developed an efficient CCS computational workflow using a machine learning model in conjunction with standard DFT methods and CCS calculations. We also performed traveling wave IM-MS (TWIMS) experiments to validate the extant experimental values and to assess uncertainties in experimentally measured CCS values. The developed workflow yielded accurate structural predictions and provided unique insights into the likely preferred conformation analyzed in IM-MS experiments. The complete workflow makes the computation of CCS values tractable for a large number of conformationally flexible metabolites with complex molecular structures.
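The abstract does not state how CCS values from multiple conformers are combined into one prediction. One common convention, assumed here purely for illustration, is a Boltzmann-weighted average over conformers using their relative energies (for example from DFT). The function names, the temperature default, and the energies and CCS values below are all hypothetical.

```python
import math

GAS_CONSTANT_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def boltzmann_weights(energies_kj_mol, temperature_k=298.15):
    """Normalized Boltzmann populations from conformer energies in kJ/mol."""
    if not energies_kj_mol:
        raise ValueError("at least one conformer energy is required")
    rt = GAS_CONSTANT_KJ_PER_MOL_K * temperature_k
    lowest = min(energies_kj_mol)  # shift so the lowest conformer has energy 0
    factors = [math.exp(-(energy - lowest) / rt) for energy in energies_kj_mol]
    total = sum(factors)
    return [factor / total for factor in factors]


def boltzmann_weighted_ccs(conformers, temperature_k=298.15):
    """Population-weighted CCS from (energy in kJ/mol, CCS) pairs."""
    energies = [energy for energy, _ in conformers]
    weights = boltzmann_weights(energies, temperature_k)
    return sum(weight * ccs for weight, (_, ccs) in zip(weights, conformers))


# Made-up conformer ensemble: (relative energy in kJ/mol, computed CCS).
conformers = [
    (0.0, 152.3),  # lowest-energy conformer dominates the average
    (2.5, 158.1),
    (8.0, 165.4),  # high-energy conformer contributes little
]
print(round(boltzmann_weighted_ccs(conformers), 2))
```

Under this assumption the prediction is pulled toward low-energy conformers, so errors in the relative energies feed directly into the predicted CCS.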
Guanidinoacetate methyltransferase (GAMT) deficiency is an autosomal recessive genetic disorder that results in global developmental delay and intellectual disability. There is evidence that early treatment prevents intellectual disability and seizures. GAMT deficiency is now being discussed as a potential addition to the U.S. Recommended Uniform Screening Panel (RUSP), so the availability of suitable screening methods must be considered. A derivatized neonatal screening method that quantifies creatine (CRE) and guanidinoacetic acid (GAA) in dried blood spots (DBS) by tandem mass spectrometry (MS/MS) has been described. Its key feature is the ability to detect CRE and GAA in the same extract generated from neonatal DBS during amino acid (AA) and acylcarnitine (AC) analysis. However, more laboratories are adopting non-derivatized MS/MS screening methods. We describe an improved, non-derivatized DBS extraction and MS/MS analytical method (AAAC-GAMT) that incorporates quantitation of CRE and GAA into the routine analysis of amino acids, acylcarnitines, and succinylacetone. The non-derivatized AAAC-GAMT method performs comparably to the stand-alone GAMT and non-derivatized AAAC screening methods, indicating its potential suitability for high-throughput GAMT neonatal screening.