Background: Mass spectrometry has become the analytical method of choice in metabolomics research. The identification of unknown compounds is the main bottleneck. In addition to the precursor mass, tandem MS spectra carry informative fragment peaks, but spectral libraries of measured reference compounds are far from covering the complete chemical space. Compound libraries such as PubChem or KEGG describe a much larger number of compounds, whose in silico fragmentation can be compared with the spectra of unknown metabolites.
Results: We created the MetFrag suite to obtain a candidate list from compound libraries based on the precursor mass; the candidates are subsequently ranked by the agreement between measured and in silico fragments. In the evaluation, MetFrag was able to rank most of the correct compounds within the top 3 candidates returned by an exact mass query in KEGG. Compared to a previously published study, MetFrag obtained better results than the commercial MassFrontier software. Especially for large compound libraries, the candidates with a good score show a high structural similarity or differ only in stereochemistry; a subsequent clustering based on chemical distances reduces this redundancy. The in silico fragmentation requires less than a second to process a molecule, and on an average desktop PC a MetFrag search takes on average about 30 seconds in KEGG and about 300 seconds in PubChem.
Conclusions: We presented a method that is able to identify small molecules from tandem MS measurements, even without spectral reference data or a large set of fragmentation rules. With today's massive general-purpose compound libraries we obtain dozens of very similar candidates, which still allows a confident estimate of the correct compound class. Our tool MetFrag improves the identification of unknown substances from tandem MS spectra and delivers better results than comparable commercial software.
MetFrag is available through a web application, web services and as a Java library. The web frontend allows the end user to analyse single spectra and browse the results, whereas the web service and console application are intended for batch searches and evaluation.
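The two-stage workflow described in the Results (candidate retrieval by precursor mass, then ranking by fragment agreement) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` class, the ppm tolerance, the precomputed `fragment_masses` list, and the scoring rule (fraction of measured peaks explained) are hypothetical and are not MetFrag's actual data model, fragmenter, or scoring function. Adduct and ion-mode corrections are ignored, so the precursor mass is treated as a neutral exact mass.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical library entry; not a MetFrag data structure."""
    name: str
    exact_mass: float
    # Assumed to be precomputed by some in silico fragmenter.
    fragment_masses: list


def within_ppm(measured, theoretical, ppm):
    """True if the measured mass lies within `ppm` of the theoretical mass."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def select_candidates(library, precursor_mass, ppm=10.0):
    """Stage 1: keep library entries whose exact mass matches the precursor."""
    return [c for c in library if within_ppm(precursor_mass, c.exact_mass, ppm)]


def score(candidate, peaks, ppm=10.0):
    """Stage 2 (simplified assumption): fraction of measured peaks that
    match at least one in silico fragment mass of the candidate."""
    if not peaks:
        return 0.0
    matched = sum(
        1
        for peak in peaks
        if any(within_ppm(peak, frag, ppm) for frag in candidate.fragment_masses)
    )
    return matched / len(peaks)


def rank(library, precursor_mass, peaks, ppm=10.0):
    """Retrieve candidates by mass, then sort them by descending score."""
    candidates = select_candidates(library, precursor_mass, ppm)
    return sorted(candidates, key=lambda c: score(c, peaks, ppm), reverse=True)
```

Calling `rank(library, precursor_mass, peaks)` returns only the candidates that pass the mass filter, best-explained first. Isomers with identical exact mass all pass the first stage, which is why the second stage and the clustering mentioned above matter for large libraries.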
We survey recent advances in algorithms for route planning in transportation networks. For road networks, we show that one can compute driving directions in milliseconds or less even at continental scale. A variety of techniques provide different trade-offs between preprocessing effort, space requirements, and query time. Some algorithms can answer queries in a fraction of a microsecond, while others can deal efficiently with real-time traffic. Journey planning on public transportation systems, although conceptually similar, is a significantly harder problem due to its inherent time-dependent and multicriteria nature. Although exact algorithms are fast enough for interactive queries on metropolitan transit systems, dealing with continent-sized instances requires simplifications or heavy preprocessing. The multimodal route planning problem, which seeks journeys combining schedule-based transportation (buses, trains) with unrestricted modes (walking, driving), is even harder, relying on approximate solutions even for metropolitan inputs.
An important problem arising in the study of complex networks, for instance in community detection and motif finding, is the sampling of graphs with fixed degree sequence. The equivalent problem of generating random 0,1 matrices with fixed row and column sums is frequently used as a quantitative tool in ecology. It has, however, proven very challenging to design sampling algorithms that are both fast and unbiased.
This article focusses on Markov chain approaches for sampling, where a close-to-random graph is produced by applying a large number N of small changes to a given graph. Examples are the switch chain and the Curveball chain, which are both commonly used by practitioners because they are easy to implement and known to sample without bias when N is large enough. Within theoretical research, much effort has gone into proving bounds on N. However, existing theoretical bounds are impractically large for most applications, while experiments suggest that far fewer steps are needed to obtain a good sample.
The contribution of this article is twofold. Firstly, it is a step towards a better understanding of the discrepancy between experimental observations and theoretically proven bounds. In particular, we argue that while existing Markov chain algorithms run on the set of all labelled graphs with a given degree sequence, node labels are unimportant in practice and are usually ignored in determining experimental bounds. We prove that ignoring node labels corresponds to projecting a Markov chain onto equivalence classes of isomorphic graphs and that the resulting projected Markov chain converges to its stationary distribution at least as fast as the original Markov chain. Often convergence is much faster, as we show in examples, explaining part of the difference between theory and experiments. The speed-up comes from the fact that the projected Markov
arXiv:1803.02624v3 [math.CO]
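The "small changes" applied by the switch chain mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the article's code: the function names, the edge-list representation, and the choice to count rejected proposals as steps (a lazy chain) are assumptions. Each step picks two distinct edges {a, b} and {c, d} and proposes replacing them with {a, d} and {c, b}; the proposal is rejected if it would create a self-loop or a duplicate edge, so every vertex keeps its degree and the graph stays simple.

```python
import random


def switch_step(edge_list, edge_set, rng):
    """Attempt one switch; return True if the graph was changed."""
    i, j = rng.sample(range(len(edge_list)), 2)
    a, b = edge_list[i]
    c, d = edge_list[j]
    # Randomly orient the second edge so both possible rewirings can occur.
    if rng.random() < 0.5:
        c, d = d, c
    # Proposed new edges: {a, d} and {c, b}.
    if a == d or c == b:
        return False  # would create a self-loop
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if new1 in edge_set or new2 in edge_set:
        return False  # would create a multi-edge
    edge_set.remove(frozenset((a, b)))
    edge_set.remove(frozenset((c, d)))
    edge_set.add(new1)
    edge_set.add(new2)
    edge_list[i] = (a, d)
    edge_list[j] = (c, b)
    return True


def sample_graph(edges, n_steps, seed=None):
    """Run n_steps switch attempts starting from a simple undirected graph.

    `edges` is an iterable of vertex pairs; rejected attempts count as steps.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    if len(edge_list) < 2:
        return edge_list  # no switch is possible
    for _ in range(n_steps):
        switch_step(edge_list, edge_set, rng)
    return edge_list
```

For example, `sample_graph([(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)], n_steps=1000, seed=1)` returns an edge list in which every vertex still has degree 2. How large `n_steps` (the N of the text) must be for the output to be close to uniform is exactly the question the abstract addresses; the sketch makes no claim about it.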