Predicting the Viability of Beta-Lactamase: How Folding and Binding Free Energies Correlate with Beta-Lactamase Fitness

Yang, Jordan; Naik, Nandita; Patel, Jagdish Suresh; Wylie, C. Scott; Gu, Wenze; Huang, Jessie; Ytreberg, F. Marty; Naik, Mandar T.; Weinreich, Daniel; Rubenstein, Brenda M.

doi:10.1101/2020.04.15.043661

2020

DOI: 10.1101/2020.04.15.043661

|View full text |Cite

Preprint

Predicting the Viability of Beta-Lactamase: How Folding and Binding Free Energies Correlate with Beta-Lactamase Fitness

Jordan Yang

Nandita Naik

Jagdish Suresh Patel

et al.

Abstract: One of the long-standing holy grails of molecular evolution has been the ability to predict an organism's fitness directly from its genotype. With such predictive abilities in hand, researchers would be able to more accurately forecast how organisms will evolve and how proteins with novel functions could be engineered, leading to revolutionary advances in medicine and biotechnology. In this work, we assemble the largest reported set of experimental TEM-1 β-lactamase folding free energies and use this data in c… Show more

Help me understand this report

View published versions

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

Supporting

Mentioning

Contrasting

Year Published

2020

Publication Types

Select...

Preprint1

Relationship

Self Cite0

Independent1

Authors

Journals

Cited by 1 publication

References 98 publications

(83 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

Machine Learning-Assisted Directed Evolution Navigates a Combinatorial Epistatic Fitness Landscape with Minimal Screening Burden

Wittmann¹,

Yue²,

Arnold³

2020

Preprint

View full text Add to dashboard Cite

Due to screening limitations, in directed evolution (DE) of proteins it is rarely feasible to fully evaluate combinatorial mutant libraries made by mutagenesis at multiple sites. Instead, DE often involves a single-step greedy optimization in which the mutation in the highest-fitness variant identified in each round of single-site mutagenesis is fixed. However, because the effects of a mutation can depend on the presence or absence of other mutations, the efficiency and effectiveness of a single-step greedy walk is influenced by both the starting variant and the order in which beneficial mutations are identified—the process is path-dependent. We recently demonstrated a path-independent machine learning-assisted approach to directed evolution (MLDE) that allows in silico screening of full combinatorial libraries made by simultaneous saturation mutagenesis, thus explicitly capturing the effects of cooperative mutations and bypassing the path-dependence that can limit greedy optimization. Here, we thoroughly investigate and optimize an MLDE workflow by testing a number of design considerations of the MLDE pipeline. Specifically, we (1) test the effects of different encoding strategies on MLDE efficiency, (2) integrate new models and a training procedure more amenable to protein engineering tasks, and (3) incorporate training set design strategies to avoid information-poor low-fitness protein variants (“holes”) in the training data. When applied to an epistatic, hole-filled, four-site combinatorial fitness landscape of protein G domain B1 (GB1), the resulting focused training MLDE (ftMLDE) protocol achieved the global fitness maximum up to 92% of the time at a total screening burden of 470 variants. In contrast, minimal-screening-burden single-step greedy optimization over the GB1 fitness landscape reached the global maximum just 1.2% of the time; ftMLDE matching this minimal screening burden (80 total variants) achieved the global optimum up to 9.6% of the time with a 49% higher expected maximum fitness achieved. To facilitate further development of MLDE, we present the MLDE software package (https://github.com/fhalab/MLDE), which is designed for use by protein engineers without computational or machine learning expertise.

show abstract

Machine Learning-Assisted Directed Evolution Navigates a Combinatorial Epistatic Fitness Landscape with Minimal Screening Burden

Wittmann¹,

Yue²,

Arnold³

2020

Preprint

View full text Add to dashboard Cite

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Predicting the Viability of Beta-Lactamase: How Folding and Binding Free Energies Correlate with Beta-Lactamase Fitness

Cited by 1 publication

References 98 publications

Machine Learning-Assisted Directed Evolution Navigates a Combinatorial Epistatic Fitness Landscape with Minimal Screening Burden

Machine Learning-Assisted Directed Evolution Navigates a Combinatorial Epistatic Fitness Landscape with Minimal Screening Burden

Contact Info

Product

Resources

About