Background
Analyzing LC-MS metabolomic datasets is challenging across a wide range of disciplines because it requires extensive processing of very large amounts of data. Several LC-MS data analysis packages have been developed in recent years to facilitate this analysis. However, most of these strategies rely on chromatographic alignment and peak shaping, and often associate each “feature” (i.e., chromatographic peak) with a single m/z measurement. An alternative data analysis strategy that is applicable to most types of MS datasets and properly addresses these issues therefore remains a challenge in the metabolomics field.
Results
Here, we present an alternative approach called ROIMCR that: i) filters and compresses massive LC-MS datasets, transforming their original structure into a data matrix of features without losing relevant information, by searching for regions of interest (ROIs) in the m/z domain; and ii) resolves the compressed data to identify their contributing pure components, without previous alignment or peak shaping, by applying Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) analysis. In this study, the basics of the ROIMCR method and its implementation are described in detail. Data were analyzed using the MATLAB (The MathWorks, Inc., www.mathworks.com) programming and computing environment. The application of the ROIMCR methodology is illustrated with an example of LC-MS data generated in a lipidomic study and with other examples of recent applications.
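The two steps described above can be illustrated with a minimal sketch. It is written in Python with NumPy rather than MATLAB and is not the authors' implementation. The function names (`roi_compress`, `mcr_als`), their parameters, the rule of keeping the largest intensity when two signals from one scan fall in the same ROI, and the use of simple clipping to enforce non-negativity are all assumptions made for illustration. Practical MCR-ALS implementations typically use constrained least squares and further constraints.

```python
import numpy as np

def roi_compress(scans, threshold, mz_tol, min_occurrences):
    """Group signals above `threshold` into m/z regions of interest (ROIs).

    scans: list of (mz_values, intensities) pairs, one pair per scan.
    Returns the mean m/z of each kept ROI and a (n_scans x n_rois) matrix.
    """
    rois = []  # each ROI stores its m/z values and (scan index, intensity) points
    for t, (mz, inten) in enumerate(scans):
        for m, i in zip(mz, inten):
            if i < threshold:
                continue  # filter out low-intensity signals
            for roi in rois:
                if abs(np.mean(roi["mz"]) - m) <= mz_tol:
                    roi["mz"].append(m)
                    roi["pts"].append((t, i))
                    break
            else:
                rois.append({"mz": [m], "pts": [(t, i)]})
    # Keep only ROIs observed often enough, ordered by mean m/z.
    kept = [r for r in rois if len(r["pts"]) >= min_occurrences]
    kept.sort(key=lambda r: np.mean(r["mz"]))
    D = np.zeros((len(scans), len(kept)))
    for j, r in enumerate(kept):
        for t, i in r["pts"]:
            D[t, j] = max(D[t, j], i)  # assumption: keep the largest value per scan
    roi_mz = np.array([np.mean(r["mz"]) for r in kept])
    return roi_mz, D

def mcr_als(D, S0, n_iter=50):
    """Alternating least squares for D ~ C @ S with non-negative C and S.

    D: (n_scans x n_rois) data, S0: (n_components x n_rois) initial spectra.
    Non-negativity is imposed by clipping, a crude stand-in for
    constrained least squares.
    """
    S = S0.copy()
    for _ in range(n_iter):
        C = np.clip(D @ np.linalg.pinv(S), 0, None)  # elution profiles
        S = np.clip(np.linalg.pinv(C) @ D, 0, None)  # pure spectra
    return C, S

# Toy data: three scans, m/z values that drift slightly between scans.
scans = [
    (np.array([100.001, 200.000]), np.array([50.0, 5.0])),
    (np.array([100.003, 200.002, 300.000]), np.array([80.0, 60.0, 100.0])),
    (np.array([99.999, 200.001]), np.array([40.0, 70.0])),
]
roi_mz, D_roi = roi_compress(scans, threshold=10.0, mz_tol=0.01, min_occurrences=2)
```

In this toy input the signal near m/z 300 appears in only one scan, so it is discarded, and the remaining signals form two ROI columns whose m/z values are the means of the measured values rather than the centres of fixed bins. The resulting matrix can then be passed to `mcr_als` together with an initial estimate of the spectra, for example rows of the matrix taken at scans where a single component dominates.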
Conclusions
The methodology presented here combines the benefits of data filtering and compression based on the search for ROI features, without loss of spectral accuracy, with those of the powerful MCR-ALS data resolution method, without the need to perform chromatographic peak alignment or modelling. The presented method is a powerful alternative to existing data analysis approaches that do not use MCR-ALS to resolve LC-MS data. ROIMCR is also an improved strategy compared with direct applications of MCR-ALS that rely on less powerful data compression strategies such as binning and windowing. Overall, the strategy presented here confirms the usefulness of the ROIMCR chemometrics method for analyzing LC-MS untargeted metabolomics data.
Electronic supplementary material
The online version of this article (10.1186/s12859-019-2848-8) contains supplementary material, which is available to authorized users.