Untargeted metabolomics
using liquid chromatography coupled to
mass spectrometry (LC–MS) allows the detection of thousands
of metabolites in biological samples. However, LC–MS data annotation
is still considered a major bottleneck in the metabolomics pipeline
since only a small fraction of the metabolites present in the sample
can be annotated with the required confidence level. Here, we introduce
mWISE (metabolomics wise inference of speck entities), an R package
for context-based annotation of LC–MS data. The algorithm consists
of three main steps aimed at (i) matching mass-to-charge ratio values
to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, (ii)
clustering and filtering the potential KEGG candidates, and (iii)
building a final prioritized list using diffusion in graphs. The algorithm's
performance is evaluated on three publicly available studies using
both positive and negative ionization modes. We have also compared
mWISE to other available annotation algorithms in terms of their performance
and computation time. In particular, we explored four different configurations
for mWISE, and all four of them outperform xMSannotator (a state-of-the-art
annotator) in terms of both performance and computation time. Using
a diffusion configuration that combines the biological network obtained
from the FELLA R package and raw scores, mWISE achieves a mean sensitivity
(standard deviation) across data sets of 0.63 (0.07), while xMSannotator
achieves 0.55 (0.19). We have also shown that the
chemical structures of the compounds proposed by mWISE are closer
to the original compounds than those proposed by xMSannotator. Finally,
we explore the diffusion prioritization separately, showing its key
role in the annotation process. mWISE is freely available on GitHub
under a GPL license.
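
To make steps (i) and (iii) of the algorithm more concrete, the following is a minimal Python sketch, not mWISE's actual R implementation. It assumes a ppm-tolerance match of observed mass-to-charge values against a toy compound table, followed by a random walk with restart on a toy graph as a simple stand-in for graph diffusion. The function names (`match_mz`, `diffuse`), the adduct table, the two-compound table, the node labels, and all parameter values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names, tables, and parameters are assumptions,
# not part of the mWISE package.

# Hypothetical adduct table: adduct name -> approximate mass shift in Da
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
}

# Toy compound table: KEGG compound ID -> monoisotopic mass in Da
COMPOUNDS = {
    "C00031": 180.06339,  # D-glucose
    "C00025": 147.05316,  # L-glutamate
}


def match_mz(mz, ppm=10.0):
    """Return (compound, adduct, ppm error) for every candidate within tolerance."""
    hits = []
    for cid, mass in COMPOUNDS.items():
        for adduct, shift in ADDUCTS.items():
            theoretical = mass + shift
            error = abs(mz - theoretical) / theoretical * 1e6
            if error <= ppm:
                hits.append((cid, adduct, round(error, 2)))
    return hits


def diffuse(adjacency, seeds, restart=0.3, iters=50):
    """Random walk with restart: spread seed scores over an undirected graph.

    Assumes every node has at least one neighbour, so total score is conserved.
    """
    nodes = list(adjacency)
    total = sum(seeds.values())
    start = {n: seeds.get(n, 0.0) / total for n in nodes}
    scores = dict(start)
    for _ in range(iters):
        # Each step restarts at the seeds with probability `restart` ...
        updated = {n: restart * start[n] for n in nodes}
        # ... and otherwise moves to a uniformly chosen neighbour.
        for n in nodes:
            neighbours = adjacency[n]
            for m in neighbours:
                updated[m] += (1 - restart) * scores[n] / len(neighbours)
        scores = updated
    return scores


# Step (i): an observed m/z close to protonated glucose
candidates = match_mz(181.0707)

# Step (iii): toy path graph A - B - C - D with all evidence placed on A
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
ranking = diffuse(graph, {"A": 1.0})
```

In this toy setting, nodes closer to the seeded node receive higher scores, which is the intuition behind using network context to prioritize candidates that share a mass match.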