Illuminating the dark matter in metabolomics

Silva, Ricardo R. da; Dorrestein, Pieter C.; Quinn, Robert A.

doi:10.1073/pnas.1516878112

Cited by 443 publications

(387 citation statements)

References 19 publications

Supporting

Mentioning

380

Contrasting

Unclassified


“…Analogous datasets and cyberinfrastructure are now emerging that can be applied to investigate DOM chemistry, for example MetaboLights (www.ebi.ac.uk/metabolights/) and Global Natural Products Social Molecular Networking (GNPS; gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp), which emphasize mass spectrometry knowledge capture and dissemination using social networking. In concert with data accessibility (141), new or existing infrastructure can be specifically dedicated to the growing needs of the DOM community. Well-engineered data systems adopted by collaborating scientists are the key to keeping up with the burgeoning capacity to generate data.…”

Section: Box 1: Cyberinfrastructure

mentioning

confidence: 99%

Deciphering ocean carbon in a changing world

Moran, Kujawinski, Stubbins, et al. 2016. Proc. Natl. Acad. Sci. U.S.A.

Self Cite


Dissolved organic matter (DOM) in the oceans is one of the largest pools of reduced carbon on Earth, comparable in size to the atmospheric CO2 reservoir. A vast number of compounds are present in DOM, and they play important roles in all major element cycles, contribute to the storage of atmospheric CO2 in the ocean, support marine ecosystems, and facilitate interactions between organisms. At the heart of the DOM cycle lie molecular-level relationships between the individual compounds in DOM and the members of the ocean microbiome that produce and consume them. In the past, these connections have eluded clear definition because of the sheer numerical complexity of both DOM molecules and microorganisms. Emerging tools in analytical chemistry, microbiology, and informatics are breaking down the barriers to a fuller appreciation of these connections. Here we highlight questions being addressed using recent methodological and technological developments in those fields and consider how these advances are transforming our understanding of some of the most important reactions of the marine carbon cycle.

Keywords: dissolved organic matter | marine microbes | cyberinfrastructure


“…This approach, paired with the high acquisition speed (>1 Hz) of state-of-the-art instruments, results in thousands of spectra per LC-MS/MS run. For a reliable data analysis and reproducible interpretation of the results, bioinformatic workflows including comprehensive databases and statistical significance estimation are crucial (da Silva et al, 2015; Böcker, 2017; Scheubert et al, 2017; Weber et al, 2017) and have been very recently employed for marine metabolomic studies (Hartmann et al, 2017; Kujawinski et al, 2017; Longnecker and Kujawinski, 2017). With these new bioinformatic tools and instrumental improvements in sensitivity, acquisition speed and resolution, we anticipate that the techniques used for DOM characterization will further shift toward non-targeted analyses using high-resolution LC-MS/MS that provide inventories of molecular structures in complex environmental datasets.…”

Section: Introduction

mentioning

confidence: 99%

High-Resolution Liquid Chromatography Tandem Mass Spectrometry Enables Large Scale Molecular Characterization of Dissolved Organic Matter

et al. 2017

Self Cite


Dissolved organic matter (DOM) is arguably one of the most complex exometabolomes on Earth, and comprises thousands of compounds that together contribute more than 600 × 10^15 g of carbon. This reservoir is primarily the product of interactions between the upper ocean's microbial food web, yet abiotic processes that occur over millennia have also modified many of its molecules. The compounds within this reservoir play important roles in determining the rate and extent of element exchange between inorganic reservoirs and the marine biosphere, while also mediating microbe-microbe interactions. As such, there has been a widespread effort to characterize DOM using high-resolution analytical methods including nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS). To date, molecular information in DOM has been primarily obtained through calculated molecular formulas from exact mass. This approach has the advantage of being non-targeted, accessing the inherent complexity of DOM. Molecular structures are, however, still elusive, and the most commonly used instruments are costly. More recently, tandem mass spectrometry has been employed to more precisely identify DOM components through comparison to library mass spectra. Here we describe a data acquisition and analysis workflow that expands the repertoire of high-resolution analytical approaches available to access the complexity of DOM molecules that are amenable to electrospray ionization (ESI) MS. We couple liquid chromatographic separation with tandem MS (LC-MS/MS) and a data analysis pipeline that integrates peak extraction from extracted ion chromatograms (XIC), molecular formula calculation and molecular networking. This provides more precise structural characterization.
Although only around 1% of detectable DOM compounds can be annotated through publicly available spectral libraries, community-wide participation in populating and annotating DOM datasets could rapidly increase the annotation rate and should be broadly encouraged. Our analysis also identifies shortcomings of the current data analysis workflow that need to be addressed by the community in the future. This work will lay the foundation for an integrative, non-targeted molecular analysis of DOM which, together with next generation sequencing, meta-proteomics and physical data, will pave the way to a more comprehensive understanding of the role of DOM in structuring marine ecosystems.
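The abstract above notes that molecular information in DOM has mostly come from molecular formulas calculated from exact mass. The idea can be illustrated with a minimal sketch, assuming only the elements C, H, N and O, a 5 ppm mass tolerance, and hypothetical names (`candidate_formulas`, `max_counts`). None of this is taken from the cited paper, and real tools typically add more elements, isotope patterns and chemical plausibility filters.

```python
from itertools import product

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def candidate_formulas(neutral_mass, ppm=5.0, max_counts=(20, 40, 5, 15)):
    """Return (C, H, N, O) count tuples whose mass is within `ppm` of neutral_mass.

    `max_counts` caps the number of C, H, N and O atoms that are searched
    (illustrative limits, not values from any published workflow).
    """
    tol = neutral_mass * ppm * 1e-6
    max_c, max_h, max_n, max_o = max_counts
    hits = []
    for c, n, o in product(range(1, max_c + 1), range(max_n + 1), range(max_o + 1)):
        base = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
        if base > neutral_mass + tol:
            continue
        # Solve for the hydrogen count directly instead of looping over it.
        h = round((neutral_mass - base) / MONO["H"])
        if 0 <= h <= max_h and abs(base + h * MONO["H"] - neutral_mass) <= tol:
            hits.append((c, h, n, o))
    return hits
```

For a neutral mass of 180.06339 Da, the returned candidates include (6, 12, 0, 6), that is C6H12O6. Several formulas can fall inside the same tolerance window, which is one reason exact mass alone leaves structures ambiguous.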


“…Most tools compare individual fragmentation spectra to reference spectra (5, 7) stored in public databases, for example, MassBank (8) or Human Metabolome Database (9), and are thus constrained by the limited number of reference spectra (10-12). Poor identification coverage can result in poor biochemical insight.…”

mentioning

confidence: 99%

Topic modeling for untargeted substructure exploration in metabolomics

Hooft, Wandy, Barrett, et al. 2016. Proc. Natl. Acad. Sci. U.S.A.


The potential of untargeted metabolomics to answer important questions across the life sciences is hindered because of a paucity of computational tools that enable extraction of key biochemically relevant information. Available tools focus on using mass spectrometry fragmentation spectra to identify molecules whose behavior suggests they are relevant to the system under study. Unfortunately, fragmentation spectra cannot identify molecules in isolation but require authentic standards or databases of known fragmented molecules. Fragmentation spectra are, however, replete with information pertaining to the biochemical processes present, much of which is currently neglected. Here, we present an analytical workflow that exploits all fragmentation data from a given experiment to extract biochemically relevant features in an unsupervised manner. We demonstrate that an algorithm originally used for text mining, latent Dirichlet allocation, can be adapted to handle metabolomics datasets. Our approach extracts biochemically relevant molecular substructures ("Mass2Motifs") from spectra as sets of co-occurring molecular fragments and neutral losses. The analysis allows us to isolate molecular substructures, whose presence allows molecules to be grouped based on shared substructures regardless of classical spectral similarity. These substructures, in turn, support putative de novo structural annotation of molecules. Combining this spectral connectivity with orthogonal correlations (e.g., common abundance changes under system perturbation) significantly enhances our ability to provide mechanistic explanations for biological behavior.

Keywords: metabolomics | mass spectrometry | fragmentation | bioinformatics | topic modeling

Mass spectrometry (MS)-based metabolomics aims to capture the entire small-molecule composition of biological systems. Analysis of MS metabolomics data is challenging, as many molecules cannot be identified from their mass (e.g., isobaric molecules and isomers) (1-3). Separation by liquid chromatography before MS (LC-MS) can add discriminatory information but does not solve the problem, as isomers can exhibit similar chromatographic behavior, and chromatographic retention time is currently unpredictable.

Fragmentation spectra have been used to partially overcome this problem (4-6). Most tools compare individual fragmentation spectra to reference spectra (5, 7) stored in public databases, for example, MassBank (8) or Human Metabolome Database (9), and are thus constrained by the limited number of reference spectra (10-12). Poor identification coverage can result in poor biochemical insight. We propose a method that analyzes all acquired fragmentation spectra to expose underlying biochemistry without relying on metabolite identification, inspired by machine-learning techniques developed initially for text processing (13).

The paucity of techniques that share information across fragmentation spectra can be explained by the complexity of fragmentation data (14). One example, "Molecular Networking," clusters MS1 peaks by th...
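The abstract describes substructures as sets of co-occurring fragments and neutral losses, found with a text-mining algorithm. One way to picture the input such an algorithm needs is to turn each spectrum into a bag of "words". The sketch below is only an illustration under stated assumptions: the function name `spectrum_to_words`, the two-decimal m/z binning and the 0 to 100 intensity scaling are all hypothetical and are not taken from the cited paper or its software. The topic-modeling step itself is not shown.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks, decimals=2):
    """Convert one fragmentation spectrum into a bag of 'words'.

    peaks: list of (fragment_mz, intensity) pairs.
    Each peak yields a fragment word and, if the fragment is lighter than
    the precursor, a neutral-loss word (precursor_mz - fragment_mz).
    Word counts are intensities rescaled to 0..100 (an assumed convention).
    """
    words = Counter()
    if not peaks:
        return words
    max_intensity = max(intensity for _, intensity in peaks)
    if max_intensity <= 0:
        return words
    for mz, intensity in peaks:
        count = round(100 * intensity / max_intensity)
        if count <= 0:
            continue  # drop peaks too weak to register after scaling
        words[f"fragment_{mz:.{decimals}f}"] += count
        loss = precursor_mz - mz
        if loss > 0:
            words[f"loss_{loss:.{decimals}f}"] += count
    return words


# Two made-up spectra that share a neutral loss of about 18.01 Da.
doc_a = spectrum_to_words(181.07, [(163.06, 50.0), (85.03, 100.0)])
doc_b = spectrum_to_words(205.10, [(187.09, 80.0)])
shared = set(doc_a) & set(doc_b)
```

Here `shared` contains the single word `loss_18.01`, even though the two spectra have no fragment in common. Words that recur together across many such documents are what a topic model would group into a candidate substructure.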
