BackgroundWhile commonly assumed in the biochemistry community that the control of metabolic pathways is thought to be critical to cellular function, it is unclear if metabolic pathways generally have evolutionarily stable rate limiting (flux controlling) steps.ResultsA set of evolutionary simulations using a kinetic model of a metabolic pathway was performed under different conditions to evaluate the evolutionary stability of rate limiting steps. Simulations used combinations of selection for steady state flux, selection against the cost of molecular biosynthesis, and selection against the accumulation of high concentrations of a deleterious intermediate. Two mutational regimes were used, one with mutations that on average were neutral to molecular phenotype and a second with a preponderance of activity-destroying mutations. The evolutionary stability of rate limiting steps was low in all simulations with non-neutral mutational processes. Clustering of parameter co-evolution showed divergent inter-molecular evolutionary patterns under different evolutionary regimes.ConclusionsThis study provides a null model for pathway evolution when compensatory processes dominate with potential applications to predicting pathway functional change. This result also suggests a possible mechanism in which studies in statistical genetics that aim to associate a genotype to a phenotype assuming independent action of variants may be mis-specified through a mis-characterization of the link between individual gene function and pathway function. A better understanding of the genotype-phenotype map has potential applications in differentiating between compensatory changes and directional selection on pathways as well as detecting SNPs and fixed differences that might have phenotypic effects.ReviewersThis article was reviewed by Arne Elofsson, David Ardell, and Shamil Sunyaev.Electronic supplementary materialThe online version of this article (doi:10.1186/s13062-016-0133-6) contains supplementary material, which is available to authorized users.
Motivation Selecting the optimal machine learning (ML) model for a given dataset is often challenging. Automated ML (AutoML) has emerged as a powerful tool for enabling the automatic selection of ML methods and parameter settings for the prediction of biomedical endpoints. Here, we apply the tree-based pipeline optimization tool (TPOT) to predict angiographic diagnoses of coronary artery disease (CAD). With TPOT, ML models are represented as expression trees and optimal pipelines discovered using a stochastic search method called genetic programing. We provide some guidelines for TPOT-based ML pipeline selection and optimization-based on various clinical phenotypes and high-throughput metabolic profiles in the Angiography and Genes Study (ANGES). Results We analyzed nuclear magnetic resonance-derived lipoprotein and metabolite profiles in the ANGES cohort with a goal to identify the role of non-obstructive CAD patients in CAD diagnostics. We performed a comparative analysis of TPOT-generated ML pipelines with selected ML classifiers, optimized with a grid search approach, applied to two phenotypic CAD profiles. As a result, TPOT-generated ML pipelines that outperformed grid search optimized models across multiple performance metrics including balanced accuracy and area under the precision-recall curve. With the selected models, we demonstrated that the phenotypic profile that distinguishes non-obstructive CAD patients from no CAD patients is associated with higher precision, suggesting a discrepancy in the underlying processes between these phenotypes. Availability and implementation TPOT is freely available via http://epistasislab.github.io/tpot/. Contact jhmoore@upenn.edu Supplementary information Supplementary data are available at Bioinformatics online.
With the maturation of metabolomics science and proliferation of biobanks, clinical metabolic profiling is an increasingly opportunistic frontier for advancing translational clinical research. Automated Machine Learning ( AutoML ) approaches provide exciting opportunity to guide feature selection in agnostic metabolic profiling endeavors, where potentially thousands of independent data points must be evaluated. In previous research, AutoML using high-dimensional data of varying types has been demonstrably robust, outperforming traditional approaches. However, considerations for application in clinical metabolic profiling remain to be evaluated. Particularly, regarding the robustness of AutoML to identify and adjust for common clinical confounders. In this study, we present a focused case study regarding AutoML considerations for using the Tree-Based Optimization Tool ( TPOT ) in metabolic profiling of exposure to metformin in a biobank cohort. First, we propose a tandem rank-accuracy measure to guide agnostic feature selection and corresponding threshold determination in clinical metabolic profiling endeavors. Second, while AutoML, using default parameters, demonstrated potential to lack sensitivity to low-effect confounding clinical covariates, we demonstrated residual training and adjustment of metabolite features as an easily applicable approach to ensure AutoML adjustment for potential confounding characteristics. Finally, we present increased homocysteine with long-term exposure to metformin as a potentially novel, non-replicated metabolite association suggested by TPOT; an association not identified in parallel clinical metabolic profiling endeavors. While warranting independent replication, our tandem rank-accuracy measure suggests homocysteine to be the metabolite feature with largest effect, and corresponding priority for further translational clinical research. Residual training and adjustment for a potential confounding effect by BMI only slightly modified the suggested association. Increased homocysteine is thought to be associated with vitamin B12 deficiency – evaluation for potential clinical relevance is suggested. While considerations for clinical metabolic profiling are recommended, including adjustment approaches for clinical confounders, AutoML presents an exciting tool to enhance clinical metabolic profiling and advance translational research endeavors.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.