Recent years have seen an explosion of interest in understanding the physicochemical parameters that shape enzyme evolution, as well as substantial advances in computational enzyme design. This review discusses three areas where evolutionary information can be used as part of the design process: (i) using ancestral sequence reconstruction (ASR) to generate new starting points for enzyme design efforts; (ii) learning how nature uses conformational dynamics in enzyme evolution in order to mimic this process in silico; and (iii) modular design of enzymes from smaller fragments, again mimicking the process by which nature appears to create new protein folds. Using showcase examples, we highlight the importance of incorporating evolutionary information in order to keep pushing the boundaries of enzyme design.
Computational enzyme design based on protein evolution: an overview
Roughly three decades have passed since the first attempts to design new enzymes using computational approaches [1,2], and the field has matured considerably since then. While the earliest attempts at computational enzyme design focused primarily on side-chain positioning [1-4] or on narrowing the search space for in vitro directed evolution (see Glossary) studies [5], subsequent work broadly expanded the scope of the field, including the fully de novo design of new enzymes [6] (typically followed by optimization using directed evolution) and the repurposing of existing enzymes to catalyze ever more complex chemical reactions [7,8]. In addition, computational design approaches are becoming ever more streamlined, such that there now exists a range of powerful web servers that can assist in the design process [9].
In principle, computational design approaches can take two very loosely defined directions: structure-based design approaches that require some level of knowledge of the system of interest, including information about the chemical mechanisms, transition states, and key catalytic residues involved; and sequence-based design approaches that can, for example, draw on evolutionary information to predict potential hotspots for protein engineering as well as new variants with desired physicochemical properties, something that is increasingly being achieved using machine-learning approaches [10].
Computational approaches that require minimal knowledge of the molecular details of the chemical processes involved are attractive for their speed and efficiency, as exploring the underlying mechanisms and transition states typically requires significant experimental and/or computational effort.
However, much like their experimental counterparts, such approaches are likely to hit optimization plateaus [11,12] where further improvement in activity becomes extremely challenging, and without knowledge of the underlying chemistry it can be difficult, if not impossible, to overcome such plateaus. Therefore, rather than competing with each other, sequence- and structure-based approaches are highly complementary as each provides different type...