Enzyme turnover numbers (kcat) are key to understanding cellular metabolism, proteome allocation and physiological diversity, but experimentally measured kcat data are sparse and noisy. Here we provide a deep learning approach (DLKcat) for high-throughput kcat prediction for metabolic enzymes from any organism merely from substrate structures and protein sequences. DLKcat can capture kcat changes for mutated enzymes and identify amino acid residues with a strong impact on kcat values. We applied this approach to predict genome-scale kcat values for more than 300 yeast species. Additionally, we designed a Bayesian pipeline to parameterize enzyme-constrained genome-scale metabolic models from predicted kcat values. The resulting models outperformed the corresponding original enzyme-constrained genome-scale metabolic models from previous pipelines in predicting phenotypes and proteomes, and enabled us to explain phenotypic differences. DLKcat and the enzyme-constrained genome-scale metabolic model construction pipeline are valuable tools to uncover global trends of enzyme kinetics and physiological diversity, and to further elucidate cellular metabolism on a large scale.
Yeasts are known to have versatile metabolic traits, while how these metabolic traits have evolved has not been elucidated systematically. We performed integrative evolution analysis to investigate how genomic evolution determines trait generation by reconstructing genome-scale metabolic models (GEMs) for 332 yeasts. These GEMs could comprehensively characterize trait diversity and predict enzyme functionality, thereby signifying that sequence-level evolution has shaped reaction networks towards new metabolic functions. Strikingly, using GEMs, we can mechanistically map different evolutionary events, e.g. horizontal gene transfer and gene duplication, onto relevant subpathways to explain metabolic plasticity. This demonstrates that gene family expansion and enzyme promiscuity are prominent mechanisms for metabolic trait gains, while GEM simulations reveal that additional factors, such as gene loss from distant pathways, contribute to trait losses. Furthermore, our analysis could pinpoint to specific genes and pathways that have been under positive selection and relevant for the formulation of complex metabolic traits, i.e. thermotolerance and the Crabtree effect. Our findings illustrate how multidimensional evolution in both metabolic network structure and individual enzymes drives phenotypic variations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.