AbLEF: antibody language ensemble fusion for thermodynamically empowered property predictions

Rollins, Zachary A; Widatalla, Talal; Waight, Andrew; Cheng, Alan C; Metwally, Essam

doi:10.1093/bioinformatics/btae268

Bioinformatics

2024


Abstract: Motivation: Pre-trained protein language and/or structural models are often fine-tuned on drug development properties (i.e., developability properties) to accelerate drug discovery initiatives. However, these models generally rely on a single structural conformation and/or a single sequence as a molecular representation. We present a physics-based model whereby 3D conformational ensemble representations are fused by a transformer-based architecture and concatenated to a language representation t…


Cited by 1 publication

References 48 publications


Aligning protein generative models with experimental fitness via Direct Preference Optimization

Widatalla, Rafailov, Hie

2024

Preprint

Self Cite


Generative models trained on unlabeled protein datasets have demonstrated a remarkable ability to predict some biological functions without any task-specific training data. However, this capability does not extend to all relevant functions and, in many cases, the unsupervised model still underperforms task-specific, supervised baselines. We hypothesize that this is due to a fundamental "alignment gap" in which the rules learned during unsupervised training are not guaranteed to be related to the function of interest. Here, we demonstrate how to provide protein generative models with useful task-specific information without losing the rich, general knowledge learned during pretraining. Using an optimization task called Direct Preference Optimization (DPO), we align a structure-conditioned language model to generate stable protein sequences by encouraging the model to prefer stabilizing over destabilizing variants given a protein backbone structure. Our resulting model, ProteinDPO, is the first structure-conditioned language model preference-optimized to experimental data. ProteinDPO achieves competitive stability prediction and consistently outperforms both unsupervised and finetuned versions of the model. Notably, the aligned model also performs well in domains beyond its training data to enable absolute stability prediction of large proteins and binding affinity prediction of multi-chain complexes, while also enabling single-step stabilization of diverse backbones. These results indicate that ProteinDPO has learned generalizable information from its biophysical alignment data.
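The preference step described in this abstract can be illustrated with the standard DPO objective for a single pair of variants. This is only a minimal sketch: the abstract does not give ProteinDPO's implementation or hyperparameters, so the function names, the `beta` value, and the log-likelihood numbers below are hypothetical and chosen purely for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_logp_preferred, policy_logp_rejected,
             ref_logp_preferred, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    Inputs are sequence log-likelihoods under the model being tuned
    (policy) and under the frozen pretrained model (reference).
    beta=0.1 is an assumed value, not one taken from the source.
    """
    # How much more the policy likes each variant than the reference does.
    preferred_margin = policy_logp_preferred - ref_logp_preferred
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    logits = beta * (preferred_margin - rejected_margin)
    # -log(sigmoid(logits)) equals softplus(-logits).
    return softplus(-logits)


# Hypothetical log-likelihoods for a stabilizing (preferred) and a
# destabilizing (rejected) variant on the same backbone.
untuned = dpo_loss(-50.0, -50.0, -50.0, -50.0)
aligned = dpo_loss(-45.0, -55.0, -50.0, -50.0)
```

When the policy is identical to the reference, the loss equals log 2. It falls as the policy assigns relatively more likelihood to the stabilizing variant than the reference does, which is the "prefer stabilizing over destabilizing" behaviour the abstract describes. Anchoring both margins to the frozen reference model is how the objective discourages drifting away from what was learned in pretraining.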
