A cheminformatics approach to characterize metabolomes in stable-isotope-labeled organisms

Tsugawa, Hiroshi; Nakabayashi, Ryo; Mori, Tetsuya; Yamada, Yutaka; Takahashi, Mikiko; Rai, Amit; Sugiyama, Ryosuke; Yamamoto, Hiroyuki; Nakaya, Taiki; Yamazaki, Mami; Kooke, Rik; Bac-Molenaar, Johanna A.; Oztolan-Erol, Nihal; Keurentjes, Joost J. B.; Arita, Masanori; Saito, Kazuki

doi:10.1038/s41592-019-0358-2

Cited by 221 publications

(219 citation statements)

References 36 publications

Supporting

Mentioning

212

Contrasting

Unclassified


“…At present, three strategies for structural classification exist: a) Cluster compounds based on spectral similarity, then propagate compound class annotations from database search in a semiautomated manner [14][15][16]. b) Search for the query compound in a spectral library [17,18] or a structure database [19,20]; consider the top k hits for assigning compound classes. c) Use machine learning methods to directly predict compound classes from the MS/MS spectrum [19,21]. None of these strategies can address all challenges mentioned above, as we detail in the Methods section; furthermore, no ready-to-use computational tools for automated compound class annotation from LC-MS data are publicly available.…”

Section: Introduction (mentioning)

confidence: 99%

Classes for the masses: Systematic classification of unknowns using fragmentation spectra

Dührkop

Nothias

Fleischauer

et al. 2020

Preprint


Metabolomics experiments can employ non-targeted tandem mass spectrometry to detect hundreds to thousands of molecules in a biological sample. Structural annotation of molecules is typically carried out by searching their fragmentation spectra in spectral libraries or, recently, in structure databases. Annotations are limited to structures present in the library or database employed, prohibiting a thorough utilization of the experimental data. We present a computational tool for systematic compound class annotation: CANOPUS uses a deep neural network to predict 1,270 compound classes from fragmentation spectra, and explicitly targets compounds where neither spectral nor structural reference data are available. CANOPUS even predicts classes for which no MS/MS training data are available. We demonstrate the broad utility of CANOPUS by investigating the effect of the microbial colonization in the digestive system in mice, and through analysis of the chemodiversity of different Euphorbia plants; both uniquely revealing biological insights at the compound class level. DNN is trained on 1.11 million compound structures and does not require any MS/MS data. To train the DNN, we have to simulate a "realistic" probabilistic fingerprint for any given molecular structure, although no MS/MS data for this structure is available. This integration of two machine learning techniques allows CANOPUS to reach high-quality predictions for 1,270 compound classes. Because the predictions are now independent from the availability of MS/MS reference data, CANOPUS can predict compound classes even when there are no MS/MS spectra for training the method. Equally important, it can predict classes for which MS/MS training data is missing for a complete subclass. Uniquely, CANOPUS permits a global overview of the compound classes measured in a biological sample, but also the differences between cohorts at the compound class level.


“…The main goal of this step is to select the features arising from a unique metabolite signal among each cluster by using the multi-level optimization of modularity algorithm [28]. Feature clustering is first based on the peak character estimation algorithm computed by MS-DIAL, which aggregates several possible relationships at the same RT range: ion correlation among samples, MS/MS fragments in higher m/z, possible adducts and chromatogram correlations [22]. Additionally, we also implemented an index of possible neutral loss and a calculation of dimers/heteromers to tag clustered feature relationships.…”

Section: Results (mentioning)

confidence: 99%

“…Starting from the aligned peak list files determined by the MS-DIAL deconvolution process, our R package firstly removes noise signals by using generic filters. In the second step, the package groups the ion features based on the results of the MS-DIAL peak character estimation algorithm 22 providing the ion linkages of adducts, correlated chromatograms, putative ion source fragments candidates and similar metabolite profiles among samples. In the third step, clustered ion features are merged between positive ionization (PI) and negative ionization (NI) modes and the adduct relationships are corrected accordingly.…”

Section: Figure (mentioning)

confidence: 99%

MS-CleanR: A feature-filtering approach to improve annotation rate in untargeted LC-MS based metabolomics

Fraisier-Vannier

Chervin

Cabanac

et al. 2020

Preprint

Self Cite


Untargeted metabolomics using liquid chromatography-mass spectrometry (LC-MS) is currently the […] euteiches; a group of glycosylated triterpenoids overrepresented in resistant lines were identified as candidate compounds conferring pathogen resistance. MS-CleanR is implemented through a Shiny interface for intuitive use by end-users (available at: https://github.com/eMetaboHUB/MS-CleanR).
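The MS-CleanR citation statement above mentions merging clustered ion features between positive (PI) and negative (NI) ionization modes. The following is only a minimal sketch of that idea, not the package's actual implementation: it assumes singly charged [M+H]+ and [M-H]- ions only, and the function names, tuple layout, and tolerance values (`rt_tol`, `ppm_tol`) are hypothetical choices made for illustration.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def neutral_mass(mz, mode):
    """Back-calculate a neutral mass, assuming [M+H]+ or [M-H]- only."""
    return mz - PROTON_MASS if mode == "positive" else mz + PROTON_MASS


def merge_modes(pos_features, neg_features, rt_tol=0.1, ppm_tol=10.0):
    """Pair PI and NI features that agree in retention time and neutral mass.

    Each feature is a hypothetical (feature_id, mz, rt_minutes) tuple.
    Returns a list of (positive_id, negative_id) pairs.
    """
    pairs = []
    for p_id, p_mz, p_rt in pos_features:
        m_pos = neutral_mass(p_mz, "positive")
        for n_id, n_mz, n_rt in neg_features:
            m_neg = neutral_mass(n_mz, "negative")
            same_rt = abs(p_rt - n_rt) <= rt_tol
            same_mass = abs(m_pos - m_neg) / m_pos * 1e6 <= ppm_tol
            if same_rt and same_mass:
                pairs.append((p_id, n_id))
    return pairs


# Illustrative features (made-up IDs and retention times).
pos = [("P1", 457.367621, 5.02), ("P2", 300.100000, 5.00)]
neg = [("N1", 455.353069, 5.00), ("N2", 455.353069, 7.50)]
pairs = merge_modes(pos, neg)
print(pairs)
```

A real pipeline would also need to handle other adducts (for example sodium or formate adducts) and multiply charged ions, which this sketch deliberately ignores.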


“…Cheminformatics strategy for comprehensive metabolite characterization is previously described. Peaks were extracted from raw data using Qualitative Mass Hunter software (B.05.00) with ‘Find Compounds’ by ‘Find by Auto MS/MS.…”

Section: Results (mentioning)

confidence: 99%

Rapid annotation and structural characterization of saponins in the active fraction of Albizia julibrissin by high‐performance liquid chromatography coupled with quadrupole time of flight mass spectrometry based on accurate mass database

Liu

et al. 2019

J of Separation Science


The purified active fraction of Albizia julibrissin saponin was proved to be a promising adjuvant candidate for vaccine. In this study, a simple, convenient, and practical strategy was established for characterizing the saponins in this purified active fraction. The personal accurate mass database including chemical structure, molecular formula, and theoretical mass was first constructed by collecting 110 reported known saponins from genus Albizia species. The raw data was obtained by high‐performance liquid chromatography coupled with quadrupole time of flight mass spectrometry. The potential compounds were extracted from raw data, and matched with the accurate mass databases. A series of saponin compounds were predicted and their chemical structures were characterized by interpreting the tandem mass spectrometry data. A total of 29 saponins including 10 new compounds and 5 first found saponins from A. julibrissin were successfully characterized in this purified active fraction using this new strategy.
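The abstract above describes matching extracted compounds against an accurate mass database of known structures. As a rough sketch of how such a lookup can work (not the authors' workflow), the example below compares an observed m/z with theoretical values within a ppm tolerance. The `Compound` class, the `match_mz` function, the 5 ppm default, and the two database entries are all assumptions made for illustration; the entries are generic placeholder compounds, not saponins taken from the paper.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # mass of a proton in Da


@dataclass(frozen=True)
class Compound:
    name: str
    formula: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_mz(observed_mz, database, mode="negative", tol_ppm=5.0):
    """Return (compound, ppm_error) hits, assuming [M+H]+ or [M-H]- ions."""
    shift = PROTON_MASS if mode == "positive" else -PROTON_MASS
    hits = []
    for compound in database:
        theoretical_mz = compound.monoisotopic_mass + shift
        err = ppm_error(observed_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((compound, err))
    # Best (smallest absolute error) match first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Placeholder database entries for illustration only.
DATABASE = [
    Compound("glucose", "C6H12O6", 180.063388),
    Compound("oleanolic acid", "C30H48O3", 456.360345),
]

hits = match_mz(455.3531, DATABASE, mode="negative")
for compound, err in hits:
    print(f"{compound.name} ({compound.formula}): {err:+.2f} ppm")
```

An accurate mass hit only narrows the candidates to a molecular formula; as the abstract notes, the structures still have to be characterized by interpreting the tandem mass spectrometry data.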
