Metabolomics holds promise as a technology for diagnosing highly heterogeneous diseases. Conventionally, metabolomics data are analyzed for diagnosis using various statistical and machine learning classification methods. However, it remains unknown whether deep neural networks, an increasingly popular class of machine learning methods, are suitable for classifying metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 estrogen receptor positive (ER+) and 67 estrogen receptor negative (ER−), to test the accuracy of a feed-forward network deep learning (DL) framework against six widely used machine learning models: random forest (RF), support vector machines (SVM), recursive partitioning and regression trees (RPART), linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), and generalized boosted models (GBM). The DL framework achieves the highest area under the curve (AUC), 0.93, in classifying ER+/ER− patients among the seven methods. Furthermore, biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value <0.05) that cannot be discovered by the other machine learning methods. Among them, the protein digestion and absorption pathway and the ATP-binding cassette (ABC) transporters pathway are also confirmed in an integrated analysis of metabolomics and gene expression data from these samples. In summary, the deep learning method shows advantages for metabolomics-based classification of breast cancer ER status, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the metabolomics research community to adopt feed-forward network based deep learning for classification.
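The abstract ranks classifiers by AUC. As a minimal sketch of what that metric measures, the snippet below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. The labels, scores, and the names `model_a` and `model_b` are made-up toy values for illustration only; they are not data or models from the study.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ER+, 0 = ER-. Scores are invented classifier outputs.
toy_labels: List[int] = [1, 1, 1, 0, 0, 1, 0, 1]
model_scores: Dict[str, List[float]] = {
    "model_a": [0.9, 0.8, 0.7, 0.2, 0.4, 0.6, 0.3, 0.95],
    "model_b": [0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.65, 0.8],
}

# Score each model and pick the one with the highest AUC.
aucs = {name: roc_auc(toy_labels, s) for name, s in model_scores.items()}
best_model = max(aucs, key=aucs.get)
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a real comparison would normally use a library routine and held-out or cross-validated predictions.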
Objective: To identify dysregulated metabolic pathways in amyotrophic lateral sclerosis (ALS) versus control participants through untargeted metabolomics. Methods: Untargeted metabolomics was performed on plasma from ALS participants (n=125) around 6.8 months after diagnosis and healthy controls (n=71). Individual differential metabolites in ALS cases versus controls were assessed by Wilcoxon rank-sum tests, adjusted logistic regression and partial least squares-discriminant analysis (PLS-DA), while group lasso explored sub-pathway-level differences. Adjustment parameters included sex, age and body mass index (BMI). Metabolomics pathway enrichment analysis was performed on metabolites selected by the above methods. Finally, machine learning classification algorithms applied to group lasso-selected metabolites were evaluated for classifying case status. Results: There were no group differences in sex, age and BMI. Significant metabolites selected were 303 by Wilcoxon, 300 by logistic regression, 295 by PLS-DA and 259 by group lasso, corresponding to 11, 13, 12 and 22 enriched sub-pathways, respectively. ‘Benzoate metabolism’, ‘ceramides’, ‘creatine metabolism’, ‘fatty acid metabolism (acyl carnitine, polyunsaturated)’ and ‘hexosylceramides’ sub-pathways were enriched by all methods, and ‘sphingomyelins’ by all but Wilcoxon, indicating these pathways significantly associate with ALS. Finally, machine learning prediction of ALS cases using group lasso-selected metabolites achieved the best performance by regularised logistic regression with elastic net regularisation, with an area under the curve of 0.98 and specificity of 83%. Conclusion: In our analysis, ALS led to significant metabolic pathway alterations, which had correlations to known ALS pathomechanisms in the basic and clinical literature, and may represent important targets for future ALS therapeutics.
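One of the per-metabolite screens named in the Methods is the Wilcoxon rank-sum test. The following is a minimal sketch of that test for a single metabolite, using midranks for ties and a normal approximation without tie or continuity correction. It is an illustration under those stated simplifications, not the authors' pipeline; the function name `rank_sum_test` and the example values are hypothetical. In practice a library routine such as `scipy.stats.ranksums` would normally be used, followed by multiple-testing adjustment across metabolites.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p). Tied values receive the average of the ranks they span.
    The variance is not corrected for ties (a simplification).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # Pool the two groups, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold equal values with 1-based ranks i+1..j.
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Hypothetical abundances of one metabolite in cases and controls.
cases = [5.1, 6.2, 7.3, 8.4]
controls = [1.0, 2.0, 3.0, 4.0]
z_stat, p_value = rank_sum_test(cases, controls)
```

With groups this small the normal approximation is rough; an exact test would be preferable for real data of that size.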