Reliable analyte identification is critical in metabolomics experiments to ensure proper interpretation of data. Because metabolites are often chemically similar (occurring as isobars and isomers), identification by mass spectrometry or chromatography alone can be difficult. Here we show that isomeric compounds are quite common in the metabolic space covered by common metabolite databases. Further, we show that retention information can shift dramatically between different experiments, decreasing the value of external or even in-house compound databases. As a consequence, the retention information in compound databases should be updated regularly to allow reliable identification. To do so, we present a feasible and budget-conscious method for updating retention information on a regular basis using well-designed compound mixtures. For this, we combine compounds in “Ident-Mixes”, showing a way to distinctly identify chemically similar compounds through combinatorics and the principle of exclusion. We illustrate the feasibility of this approach by comparing gas chromatography (GC) columns with identical properties from three different vendors and by creating a compound database from measurements of these mixtures by liquid chromatography–mass spectrometry (LC–MS). The results show the strong influence of the materials used on retention behavior and the ability of our approach to generate high-quality identifications in a short time.
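The combinatorial idea behind the Ident-Mixes can be sketched as follows. This is only an illustration under stated assumptions, not the authors' actual mix design: each isomer is placed in a distinct subset of mixes, so the set of mixes in which a peak appears singles out one compound and excludes the others. The function names (`assign_mix_codes`, `identify`), the choice of two mixes per compound, and the example isomers are all hypothetical.

```python
from itertools import combinations

def assign_mix_codes(compounds, n_mixes, mixes_per_compound=2):
    """Assign each compound a distinct subset of mixes (hypothetical design)."""
    codes = list(combinations(range(n_mixes), mixes_per_compound))
    if len(compounds) > len(codes):
        raise ValueError("not enough mixes to give every compound a unique code")
    return {c: frozenset(code) for c, code in zip(compounds, codes)}

def identify(observed_mixes, codes):
    """Return the one compound whose code matches the mixes where the peak was seen.

    Returns None when zero or several compounds match (no unambiguous call).
    """
    observed = frozenset(observed_mixes)
    matches = [c for c, code in codes.items() if code == observed]
    return matches[0] if len(matches) == 1 else None

# Illustrative isomers that mass alone cannot separate (assumed example).
isomers = ["leucine", "isoleucine", "norleucine"]
codes = assign_mix_codes(isomers, n_mixes=3)

# A peak observed in mixes 0 and 2 but absent from mix 1 excludes the
# compounds that were spiked into mix 1.
print(identify({0, 2}, codes))  # isoleucine
```

With k mixes and two mixes per compound, this scheme distinguishes up to k(k-1)/2 isomers, which is why a small number of injections can cover many chemically similar compounds.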
Lack of reliable peak detection impedes automated analysis of large-scale gas chromatography-mass spectrometry (GC-MS) metabolomics datasets. The performance and outcome of individual peak-picking algorithms can differ widely depending on the algorithmic approach, the parameters, and the data acquisition method. Comparing algorithms is therefore difficult. Here we present a workflow for improved peak picking (WiPP), a parameter-optimising, multi-algorithm peak detection workflow for GC-MS metabolomics. WiPP evaluates the quality of detected peaks using a machine learning-based classification scheme with seven peak classes. The quality information returned by the classifier for each individual peak is merged with results from different peak detection algorithms to create one final high-quality peak set for immediate downstream analysis. Medium- and low-quality peaks are kept for further inspection. By applying WiPP to standard compound mixes and a complex biological dataset, we demonstrate that peak detection is improved through the novel way of assigning peak quality, automated parameter optimisation, and the integration of results across different embedded peak-picking algorithms. Furthermore, our approach can provide an impartial performance comparison of different peak-picking algorithms. WiPP is freely available on GitHub under the MIT licence.
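The merging step described above might look roughly like the sketch below. It is not WiPP's implementation or API: the peak representation (dictionaries with `rt`, `mz`, and `quality` keys), the three quality tiers, the retention-time tolerance, and the function name `merge_peak_sets` are all assumptions made for illustration, and the classifier itself is taken as given.

```python
def merge_peak_sets(peak_sets, rt_tol=0.05):
    """Merge classified peaks from several algorithms (illustrative only).

    High-quality peaks go into one final set, with duplicates (same m/z,
    retention time within rt_tol) kept once. Everything else is set aside
    for manual inspection.
    """
    final, inspect = [], []
    for peaks in peak_sets:
        for p in peaks:
            if p["quality"] != "high":
                inspect.append(p)
                continue
            duplicate = any(
                p["mz"] == q["mz"] and abs(p["rt"] - q["rt"]) <= rt_tol
                for q in final
            )
            if not duplicate:
                final.append(p)
    return final, inspect

# Hypothetical output of two peak pickers after classification.
algorithm_a = [
    {"rt": 5.00, "mz": 73, "quality": "high"},
    {"rt": 7.20, "mz": 147, "quality": "low"},
]
algorithm_b = [
    {"rt": 5.02, "mz": 73, "quality": "high"},   # same peak as in algorithm_a
    {"rt": 9.10, "mz": 205, "quality": "high"},
]

final_peaks, to_inspect = merge_peak_sets([algorithm_a, algorithm_b])
print(len(final_peaks), len(to_inspect))  # 2 1
```

Keeping the non-high-quality peaks in a separate list rather than discarding them mirrors the abstract's statement that medium- and low-quality peaks are retained for further inspection.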
With manual derivatization in gas chromatography-mass spectrometry, samples have varying equilibration times before analysis, which increases technical variability and limits the number of samples that can be analyzed. By contrast, automated derivatization methods can derivatize and inject each sample in an identical manner. We present a fully automated (on-line) derivatization method used for targeted analysis of different matrices. We describe method optimization and compare results from off-line and on-line derivatization protocols, including the robustness and reproducibility of the methods. Our final parameters for the derivatization process were 20 µL of methoxyamine (MeOx) in pyridine for 60 min at 30 °C, followed by 80 µL of N-methyl-N-trimethylsilyltrifluoroacetamide (MSTFA) for 30 min at 30 °C, combined with 4 h of equilibration time. The repeatability test in plasma and liver revealed a median relative standard deviation (RSD) of 16% and 10%, respectively. Serum samples showed a consistent intra-batch median RSD of 20%, with an inter-batch variability of 27% across three batches. The direct comparison of on-line versus off-line derivatization demonstrated that the on-line method was fit for purpose and improved repeatability, with a measured median RSD of 11% compared to 17% for the same method off-line. In summary, optimized on-line methods may improve results for metabolomics, and we recommend using them where available.
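For readers unfamiliar with the repeatability metric, the following sketch shows one common way to compute a median RSD across metabolites. The abstract does not state the exact calculation, so the use of the sample standard deviation is an assumption, and the peak areas and metabolite names below are invented purely to show the arithmetic.

```python
from statistics import mean, median, stdev

def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation assumed)."""
    return 100.0 * stdev(values) / mean(values)

def median_rsd(replicates_by_metabolite):
    """Median of the per-metabolite RSDs across replicate injections."""
    return median([rsd_percent(v) for v in replicates_by_metabolite.values()])

# Invented peak areas for three replicate injections per metabolite.
peak_areas = {
    "alanine": [90.0, 100.0, 110.0],   # RSD 10%
    "glucose": [80.0, 100.0, 120.0],   # RSD 20%
    "citrate": [95.0, 100.0, 105.0],   # RSD 5%
}

print(median_rsd(peak_areas))
```

Reporting the median rather than the mean keeps a few poorly reproducible metabolites from dominating the summary figure.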