2022
DOI: 10.1038/s41598-022-05195-x

Therapeutic enzyme engineering using a generative neural network

Abstract: Enhancing the potency of mRNA therapeutics is an important objective for treating rare diseases, since it may enable lower and less-frequent dosing. Enzyme engineering can increase potency of mRNA therapeutics by improving the expression, half-life, and catalytic efficiency of the mRNA-encoded enzymes. However, sequence space is incomprehensibly vast, and methods to map sequence to function (computationally or experimentally) are inaccurate or time-/labor-intensive. Here, we present a novel, broadly applicable…

Cited by 38 publications (48 citation statements)
References 60 publications
“…Changing the encoding to amino acid properties or a learned representation for protein sequences (Alley et al 2019; Rao et al 2019; Wittmann et al 2021) would give RecGen additional information that could help to predict more reliably. Another way to adapt the network would be to change the fully connected neural network layers to convolutional or recurrent network layers, which could further improve the performance (Hawkins-Hooker et al 2021; Giessel et al 2022). While improvements to the algorithm are important, the data used to train the model is likely to be even more critical.…”
Section: Discussion (mentioning, confidence: 99%)
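The quoted passage contrasts one-hot residue encodings with encodings based on amino acid properties or learned representations. A minimal sketch of the first two options, with hypothetical per-residue property values (Kyte-Doolittle hydrophobicity and net charge at neutral pH) chosen only for illustration:

```python
import numpy as np

# The canonical 20-letter amino acid alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative physicochemical properties (hydrophobicity, charge)
# for a handful of residues; a real encoding would cover all 20.
PROPERTIES = {
    "A": (1.8, 0.0), "D": (-3.5, -1.0), "K": (-3.9, 1.0),
    "L": (3.8, 0.0), "S": (-0.8, 0.0),
}

def one_hot(seq: str) -> np.ndarray:
    """Encode each residue as a 20-dimensional indicator vector."""
    idx = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        out[pos, idx[aa]] = 1.0
    return out

def property_encode(seq: str) -> np.ndarray:
    """Encode each residue by physicochemical properties instead."""
    return np.array([PROPERTIES[aa] for aa in seq])

x_onehot = one_hot("ADKL")         # shape (4, 20), one 1 per row
x_props = property_encode("ADKL")  # shape (4, 2), dense and informative
```

The property-based matrix is far lower-dimensional and carries biochemical similarity between residues, which is the extra information the quoted authors suggest could help a model predict more reliably.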
“…Thanks to this approach, it is possible to sample from this data distribution to generate new sequences. The algorithms that are most commonly used for protein sequence generation are Generative Adversarial Networks (GANs; Goodfellow et al 2014; Gupta and Zou 2018; Repecka et al 2021) and Variational Autoencoders (VAEs; Kingma and Welling 2013; Riesselman et al 2018; Costello and Martin 2019; Davidsen et al 2019; Das et al 2021; Hawkins-Hooker et al 2021; Giessel et al 2022). In addition to these two, there are also deep learning algorithms for natural language processing that have been adapted for generative modeling.…”
Section: Introduction (mentioning, confidence: 99%)
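The quoted passage names VAEs as a standard generative model for protein sequences. A minimal numpy sketch (not the paper's model) of the two VAE steps that matter for generation — sampling a latent vector via the reparameterization trick and decoding it to a sequence — where the random linear "decoder" stands in for a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LATENT_DIM, SEQ_LEN = 8, 12

def reparameterize(mu: np.ndarray, log_var: np.ndarray) -> np.ndarray:
    """z = mu + sigma * eps with eps ~ N(0, I); keeps sampling differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def decode(z: np.ndarray, W: np.ndarray) -> str:
    """Map latent z to per-position logits, then take the argmax residue."""
    logits = (W @ z).reshape(SEQ_LEN, len(AMINO_ACIDS))
    return "".join(AMINO_ACIDS[i] for i in logits.argmax(axis=1))

# Placeholder decoder weights; a trained VAE would learn these.
W = rng.standard_normal((SEQ_LEN * len(AMINO_ACIDS), LATENT_DIM))

# Sample from the prior N(0, I) and decode to a 12-residue sequence.
z = reparameterize(np.zeros(LATENT_DIM), np.zeros(LATENT_DIM))
seq = decode(z, W)
```

Once trained, sampling z from the prior and decoding is exactly the "sample from this data distribution" step the quoted authors describe; a GAN replaces the decoder with a generator trained adversarially instead of with a reconstruction-plus-KL objective.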
“…Experimental results showed that TDVAE generated a large ensemble of HsS1PL variants with preserved functional features, including the presence of the key catalytic lysine residue. Surprisingly, obtaining these variants did not require training on large sequence datasets or multiple sequence alignment information [21], which are usually difficult to obtain when sequences are as highly divergent as S1PL. We then further validated our results by predicting the structure of a subset of variants and performing molecular dynamics simulations to assess enzyme structural stability and integrity; here we found HsS1PL variants to maintain favorable inter-chain contacts to form stable, compact and largely invariant homodimeric complexes.…”
Section: Discussion (mentioning, confidence: 99%)
“…However, current methods are usually limited by the number of homolog sequences and the corresponding quality of multiple sequence alignments, along with the availability of known tertiary structures. Recently, instead, deep generative learning has proven to be a viable solution to generate new, unobserved functional proteins [21], either by learning evolutionary constraints from highly curated multiple sequence alignments [22] or directly from protein sequences [23, 24]. However, these methods require a large number of sequences to be effectively trained and extensive computational resources.…”
Section: Introduction (mentioning, confidence: 99%)
“…The authors pointed out that higher-order coevolutionary effects have the potential to provide a deeper understanding of the structure–function relationship in novel ways. 125 In 2022, Notin et al used ESM-1v, 126 a protein language model, and MSA transformer 127 in the framework of Tranception, an approach based on autoregressive transformers and inference-time retrieval for fitness prediction of various proteins and enzymes. 128 They pointed out that one limitation of the approach is that it neglects potentially important epistatic effects.…”
Section: Deep-learning Models (mentioning, confidence: 99%)
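The last quoted passage describes autoregressive transformers scoring sequences for fitness prediction. The core computation is the chain-rule log-likelihood, log p(x) = Σₜ log p(xₜ | x₍₍ₜ₎₎). A toy sketch where a uniform placeholder stands in for the transformer's softmax over the next residue:

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def next_residue_probs(prefix: str) -> np.ndarray:
    """Placeholder conditional p(x_t | x_<t); a real model (e.g. an
    autoregressive transformer) would condition a network on the prefix."""
    return np.full(len(AMINO_ACIDS), 1.0 / len(AMINO_ACIDS))

def log_likelihood(seq: str) -> float:
    """Sum per-position conditional log-probabilities along the sequence."""
    total = 0.0
    for t, aa in enumerate(seq):
        probs = next_residue_probs(seq[:t])
        total += np.log(probs[AMINO_ACIDS.index(aa)])
    return total

# Under the uniform placeholder, every 3-mer scores 3 * log(1/20).
score = log_likelihood("MKT")
```

In fitness prediction, variants are ranked by exactly this log-likelihood (or its difference from the wild type); because each term conditions only on its prefix through a learned network, specific pairwise epistatic effects are captured only implicitly, which is the limitation the quoted authors note.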