Protein conformational changes can facilitate the binding of noncognate substrates and underlie promiscuous activities. However, the contribution of substrate conformational dynamics to this process is comparatively poorly understood. Here, we analyze human (hMAT2A) and Escherichia coli (eMAT) methionine adenosyltransferases that have identical active sites but different substrate specificity. In the promiscuous hMAT2A, noncognate substrates bind in a stable conformation to allow catalysis. In contrast, noncognate substrates sample stable productive binding modes less frequently in eMAT owing to altered mobility in the enzyme active site. Different cellular concentrations of substrates likely drove the evolutionary divergence of substrate specificity in these orthologues. The observation of catalytic promiscuity in hMAT2A led to the detection of a new human metabolite, methyl thioguanosine, that is produced at elevated levels in a cancer cell line. This work establishes that identical active sites can result in different substrate specificity owing to the effects of substrate and enzyme dynamics.
The remarkable catalytic potential of enzymes in chemical synthesis, environmental bioremediation and medical therapeutics is limited by their longevity and stability.
Machine learning (ML) has the potential to revolutionize protein engineering. However, the field currently lacks standardized and rigorous evaluation benchmarks for sequence-fitness prediction, which makes accurate evaluation of the performance of different architectures difficult. Here we propose a unifying framework for ML-driven sequence-fitness prediction. Using simulated (the NK model) and empirical sequence landscapes, we define four key performance metrics: interpolation within the training domain, extrapolation outside the training domain, robustness to sparse training data, and ability to cope with epistasis/ruggedness. We show that architectural differences between algorithms consistently affect performance against these metrics across both experimental and theoretical landscapes. Moreover, landscape ruggedness is revealed to be the greatest determinant of the accuracy of sequence-fitness prediction. We hope that this benchmarking method and the code that accompanies it will enable robust evaluation and comparison of novel architectures in this emerging field and assist in the adoption of ML for protein engineering.
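For readers unfamiliar with the NK model mentioned above, the following is a minimal sketch of the classic formulation, not the code that accompanies the paper. It assumes a binary alphabet and that each site interacts with the next K sites, wrapping around the end of the sequence; the function names `make_nk_landscape` and `count_local_maxima` are hypothetical and chosen only for illustration. Each site contributes a random value that depends on its own state and the states of its K neighbours, and the fitness of a sequence is the mean of those contributions. Raising K increases epistasis, which typically makes the landscape more rugged.

```python
import itertools
import random


def make_nk_landscape(n, k, alphabet="AB", seed=0):
    """Return a fitness function for a random NK landscape.

    Assumption: site i interacts with the next k sites, wrapping around
    the end of the sequence (one common convention, chosen for brevity).
    """
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < n")
    rng = random.Random(seed)
    neighbourhoods = [[(i + j) % n for j in range(k + 1)] for i in range(n)]
    # One lookup table per site: every combination of local states gets
    # an independent uniform random contribution in [0, 1).
    tables = [
        {states: rng.random()
         for states in itertools.product(alphabet, repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(seq):
        if len(seq) != n:
            raise ValueError(f"expected a sequence of length {n}")
        contributions = (
            tables[i][tuple(seq[s] for s in sites)]
            for i, sites in enumerate(neighbourhoods)
        )
        # Fitness is the mean of the per-site contributions.
        return sum(contributions) / n

    return fitness


def count_local_maxima(fitness, n, alphabet="AB"):
    """Count sequences that no single-site substitution can improve."""
    count = 0
    for seq in itertools.product(alphabet, repeat=n):
        f = fitness(seq)
        has_better_neighbour = any(
            fitness(seq[:i] + (a,) + seq[i + 1:]) > f
            for i in range(n)
            for a in alphabet
            if a != seq[i]
        )
        if not has_better_neighbour:
            count += 1
    return count


if __name__ == "__main__":
    # Compare ruggedness (number of local maxima) as K grows.
    for k in (0, 2, 5):
        landscape = make_nk_landscape(n=6, k=k, seed=1)
        print(k, count_local_maxima(landscape, n=6))
```

With K = 0 the landscape is purely additive, so it has a single peak; counting local maxima as K increases gives one simple, if coarse, proxy for the ruggedness the abstract refers to.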
Protein conformational change can facilitate the binding of non-cognate substrates and underlie promiscuous activities. However, the contribution of substrate conformational dynamics to this process is comparatively poorly understood. Here we analyse human (hMAT2A) and Escherichia coli (eMAT) methionine adenosyltransferases that have identical active sites but different substrate specificity. In the promiscuous hMAT2A, non-cognate substrates bind in a stable conformation to allow catalysis. In contrast, non-cognate substrates rarely sample stable productive binding modes in eMAT owing to increased mobility of an active site loop. Different cellular concentrations of substrate likely drove the evolutionary divergence of substrate specificity in these orthologs. The observation of catalytic promiscuity in hMAT2A led to the detection of a new human metabolite, methyl thioguanosine, that is produced at elevated levels in a cancer cell line. This work establishes that identical active sites can result in different substrate specificity owing to the combined effects of enzyme and substrate dynamics.