SCHEMA structure-guided recombination of 3 fungal class II cellobiohydrolases (CBH II cellulases) has yielded a collection of highly thermostable CBH II chimeras. Twenty-three of 48 genes sampled from the 6,561 possible chimeric sequences were secreted by the Saccharomyces cerevisiae heterologous host in catalytically active form. Five of these chimeras have half-lives of thermal inactivation at 63°C that are greater than that of the most stable parent, CBH II enzyme from the thermophilic fungus Humicola insolens, which suggests that this chimera collection contains hundreds of highly stable cellulases. Twenty-five new sequences were designed based on mathematical modeling of the thermostabilities for the first set of chimeras. Ten of these sequences were expressed in active form; all 10 retained more activity than H. insolens CBH II after incubation at 63°C. The total of 15 validated thermostable CBH II enzymes have high sequence diversity, differing from their closest natural homologs at up to 63 amino acid positions. Selected purified thermostable chimeras hydrolyzed phosphoric acid swollen cellulose at temperatures 7 to 15°C higher than the parent enzymes. These chimeras also hydrolyzed as much or more cellulose than the parent CBH II enzymes in long-time cellulose hydrolysis assays and had pH/activity profiles as broad as, or broader than, those of the parent enzymes. Generating this group of diverse, thermostable fungal CBH II chimeras is the first step in building an inventory of stable cellulases from which optimized enzyme mixtures for biomass conversion can be formulated.
biofuels | cellobiohydrolase | cellulose hydrolysis | Trichoderma reesei | CBH II
The performance of cellulase mixtures in biomass conversion processes depends on many enzyme properties including stability, product inhibition, synergy among different cellulase components, productive binding versus nonproductive adsorption and pH dependence, in addition to the cellulose substrate physical state and composition. Given the multivariate nature of cellulose hydrolysis, it is desirable to have diverse cellulases to choose from to optimize enzyme formulations for different applications and feedstocks. Recent studies have documented the superior performance of cellulases from thermophilic fungi relative to their mesophilic counterparts in laboratory scale biomass conversion processes (1, 2), where enhanced stability leads to retention of activity over longer periods of time at both moderate and elevated temperatures. Fungal cellulases are attractive because they are highly active and can be expressed in fungal hosts such as Hypocrea jecorina (anamorph Trichoderma reesei) at levels up to 40 g/L in the supernatant. Unfortunately, the set of documented thermostable fungal cellulases is small. In the case of the processive cellobiohydrolase class II (CBH II) enzymes, <10 natural thermostable gene sequences are annotated in the CAZy database (www.cazy.org). This limited number, combined with the difficulty of using directed evolution to generate diverse thermostable...
The mapping from protein sequence to function is highly complex, making it challenging to predict how sequence changes will affect a protein’s behavior and properties. We present a supervised deep learning framework to learn the sequence–function mapping from deep mutational scanning data and make predictions for new, uncharacterized sequence variants. We test multiple neural network architectures, including a graph convolutional network that incorporates protein structure, to explore how a network’s internal representation affects its ability to learn the sequence–function mapping. Our supervised learning approach displays superior performance over physics-based and unsupervised prediction methods. We find that networks that capture nonlinear interactions and share parameters across sequence positions are important for learning the relationship between sequence and function. Further analysis of the trained models reveals the networks’ ability to learn biologically meaningful information about protein structure and mechanism. Finally, we demonstrate the models’ ability to navigate sequence space and design new proteins beyond the training set. We applied the protein G B1 domain (GB1) models to design a sequence that binds to immunoglobulin G with substantially higher affinity than wild-type GB1.
A quantitative linear model accurately (R² = 0.88) describes the thermostabilities of 54 characterized members of a family of fungal cellobiohydrolase class II (CBH II) cellulase chimeras made by SCHEMA recombination of three fungal enzymes, demonstrating that the contributions of SCHEMA sequence blocks to stability are predominantly additive. Thirty-one of 31 predicted thermostable CBH II chimeras have thermal inactivation temperatures higher than the most thermostable parent CBH II, from Humicola insolens, and the model predicts that hundreds more CBH II chimeras share this superior thermostability. Eight of eight thermostable chimeras assayed hydrolyze the solid cellulosic substrate Avicel at temperatures at least 5°C above the most stable parent, and seven of these showed superior activity in 16-h Avicel hydrolysis assays. The sequence-stability model identified a single block of sequence that adds 8.5°C to chimera thermostability. Mutating individual residues in this block identified the C313S substitution as responsible for the entire thermostabilizing effect. Introducing this mutation into the two recombination parent CBH IIs not featuring it (Hypocrea jecorina and H. insolens) decreased inactivation, increased maximum Avicel hydrolysis temperature, and improved long time hydrolysis performance. This mutation also stabilized and improved Avicel hydrolysis by Phanerochaete chrysosporium CBH II, which is only 55-56% identical to recombination parent CBH IIs. Furthermore, the C313S mutation increased total H. jecorina CBH II activity secreted by the Saccharomyces cerevisiae expression host more than 10-fold. Our results show that SCHEMA structure-guided recombination enables quantitative prediction of cellulase chimera thermostability and efficient identification of stabilizing mutations. SCHEMA is a computational approach to identifying blocks of sequence that minimize structural disruption when they are recombined in chimeric proteins (1).
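The additive block model described above can be illustrated with a minimal sketch. It is not the authors' code: the one-hot encoding with a reference parent, the ordinary least-squares fit, and every numeric value (block contributions, noise level, baseline temperature) are assumptions made up for illustration. Only the counts of 3 parents, 8 blocks, and 54 characterized chimeras come from the text.

```python
import itertools
import numpy as np

N_PARENTS = 3   # parent enzymes (from the text)
N_BLOCKS = 8    # SCHEMA blocks per chimera (from the text)
N_PARAMS = 1 + N_BLOCKS * (N_PARENTS - 1)   # intercept + one weight per non-reference block

def encode(chimera):
    """Encode a chimera (tuple of parent indices, one per block) as a feature vector.

    Parent 0 is treated as the reference (an assumption of this sketch); every
    block taken from parent 1 or 2 switches on one indicator feature.
    """
    x = np.zeros(N_PARAMS)
    x[0] = 1.0  # intercept: stability of the all-reference sequence
    for block, parent in enumerate(chimera):
        if parent > 0:
            x[1 + block * (N_PARENTS - 1) + (parent - 1)] = 1.0
    return x

rng = np.random.default_rng(0)

# HYPOTHETICAL block contributions in degrees C; these are synthetic, not measured values.
true_w = rng.normal(0.0, 2.0, size=N_PARAMS)
true_w[0] = 60.0  # hypothetical baseline inactivation temperature

# Enumerate the full library and pretend that 54 chimeras were characterized.
library = list(itertools.product(range(N_PARENTS), repeat=N_BLOCKS))
sample_idx = rng.choice(len(library), size=54, replace=False)
X = np.array([encode(library[i]) for i in sample_idx])
y = X @ true_w + rng.normal(0.0, 0.5, size=54)  # synthetic "measurements" with noise

# Ordinary least squares: each fitted weight estimates one block's additive contribution.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ w_hat
r2 = 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)

# With the fitted weights, a stability can be predicted for every library member.
predicted = np.array([encode(c) for c in library]) @ w_hat
```

Because the model is additive, the 17 parameters fitted on a few dozen sequences are enough to score the whole library, which is what makes it possible to predict stable chimeras that were never characterized.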
SCHEMA recombination of eight blocks from three fungal cellobiohydrolase class II (CBH II) genes was used in our previous work to create a library of 3^8 = 6,561 chimeric sequences, all having the native Hypocrea jecorina cellulose binding module and linker and observed to feature a degree of glycosylation similar to that found in native CBH IIs secreted by fungi (2). Synthesis and characterization of selected CBH II chimeras expressed in Saccharomyces cerevisiae revealed enzymes with thermostabilities and cellulose hydrolysis performance superior to those of the parent enzymes from Humicola insolens, H. jecorina, and Chaetomium thermophilum. Our prior analysis showed that a qualitative model based on sequence-stability data from 23 functional chimeras (categorizing blocks as destabilizing, stabilizing, or neutral) could identify highly stable chimeras in the SCHEMA library (2). When studying SCHEMA recombination of a bacterial cytochrome P450, we previously estimated that building a quantitative regression model would require stability measurements for at least 35 re...
We describe an efficient SCHEMA recombination-based approach for screening homologous enzymes to identify stabilizing amino acid sequence blocks. This approach has been used to generate active, thermostable cellobiohydrolase class I (CBH I) enzymes from the 390 625 possible chimeras that can be made by swapping eight blocks from five fungal homologs. Constructing and characterizing the parent enzymes and just 32 'monomeras' containing a single block from a homologous enzyme allowed stability contributions to be assigned to 36 of the 40 blocks from which the CBH I chimeras can be assembled. Sixteen of 16 predicted thermostable chimeras, with an average of 37 mutations relative to the closest parent, are more thermostable than the most stable parent CBH I, from the thermophilic fungus Talaromyces emersonii. Whereas none of the parent CBH Is were active >65°C, stable CBH I chimeras hydrolyzed solid cellulose at 70°C. In addition to providing a collection of diverse, thermostable CBH Is that can complement previously described stable CBH II chimeras (Heinzelman et al., Proc. Natl Acad. Sci. USA 2009;106:5610-5615) in formulating application-specific cellulase mixtures, the results show the utility of SCHEMA recombination for screening large swaths of natural enzyme sequence space for desirable amino acid blocks.
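The counts quoted in this abstract follow from simple combinatorics, sketched below. The helper names are hypothetical, and the monomera layout (one background parent with a single block swapped in from one of the other four homologs) is an assumption that matches the stated total of 32; the abstract does not say which parent serves as the background.

```python
def library_size(n_parents, n_blocks):
    """Number of chimeras when each block can come from any parent."""
    return n_parents ** n_blocks

def monomeras(background, n_parents, n_blocks):
    """List sequences identical to the background parent except for one swapped block.

    Each sequence is a tuple of parent indices, one entry per block.
    """
    result = []
    for block in range(n_blocks):
        for donor in range(n_parents):
            if donor != background:
                seq = [background] * n_blocks
                seq[block] = donor
                result.append(tuple(seq))
    return result

cbh1_library = library_size(5, 8)   # five homologs, eight blocks
cbh2_library = library_size(3, 8)   # three parents, eight blocks (the CBH II library)
cbh1_monomeras = monomeras(background=0, n_parents=5, n_blocks=8)
total_blocks = 5 * 8                # distinct parent/block combinations

print(cbh1_library, cbh2_library, len(cbh1_monomeras), total_blocks)
# prints: 390625 6561 32 40
```

Under this assumed layout, the 32 monomeras plus the parent enzymes touch every one of the 40 blocks, which suggests why so few constructs could be enough to assign stability contributions to most of them.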