Prediction of protein complex structures and interfaces has potentially wide applications and can benefit the study of biological mechanisms involving protein-protein interactions. However, the surface prediction accuracy of traditional docking methods and AlphaFold-Multimer is limited. Here we present ColabDock, a framework that builds on ColabDesign but reimplements it for restrained complex conformation prediction. With a generation-prediction architecture and a trained ranking model, ColabDock outperforms HADDOCK and ClusPro not only in complex structure prediction with simulated residue and surface restraints, but also in prediction assisted by NMR chemical shift perturbation and covalent labeling. It further assists antibody-antigen interface prediction with emulated interface scan restraints, which could be obtained from experiments such as Deep Mutation Scan. ColabDock provides a general approach for integrating sparse interface restraints of different experimental forms and sources into one optimization framework.
The rise in the number of protein sequences in the post-genomic era has led to a major breakthrough in fitting generative sequence models for contact prediction, protein design, alignment, and homology search. Despite this success, the interpretability of the modeled pairwise parameters remains limited because coevolution, phylogeny, and entropy are entangled. For contact prediction, post-correction methods have been developed to remove the contribution of entropy from the predicted contact maps. However, all remaining applications that rely on the raw parameters lack a direct method to correct for entropy. In this paper, we investigate the origins of the entropy signal and propose a new spectral regularizer to down-weight it during model fitting. We find that adding the regularizer to GREMLIN, a Markov Random Field or Potts model, allows for the inference of a sparse contact map without loss in precision, while improving interpretability and resolving overfitting issues important for sequence evaluation and design.
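To make the idea of a post-correction concrete, the sketch below applies the average product correction (APC), one widely used correction of this kind for coevolution score matrices. The abstract does not name the post-correction methods it refers to, so choosing APC here is an assumption. The function name `apc` and the toy matrix are hypothetical, and this is not the spectral regularizer proposed in the paper.

```python
def apc(scores):
    """Average product correction for a square, symmetric score matrix.

    Subtracts from each entry (i, j) the product of the row-i mean and the
    column-j mean divided by the overall mean. A position that scores highly
    against every other position (a background signal) is thereby suppressed.
    Assumes the overall mean is non-zero.
    """
    n = len(scores)
    row_mean = [sum(row) / n for row in scores]
    col_mean = [sum(scores[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [scores[i][j] - row_mean[i] * col_mean[j] / total_mean for j in range(n)]
        for i in range(n)
    ]


# Toy example (made-up numbers): a rank-one background a_i * a_j
# plus one extra "contact" signal between positions 0 and 2.
a = [1.0, 2.0, 4.0]
raw = [[a[i] * a[j] for j in range(3)] for i in range(3)]
raw[0][2] += 3.0
raw[2][0] += 3.0

corrected = apc(raw)
```

In the raw toy matrix the largest off-diagonal score belongs to the pair (1, 2), which is pure background. After correction the pair (0, 2), which carries the added signal, ranks highest. Because this correction acts on the score map after fitting, it leaves the fitted parameters themselves unchanged, which is the limitation the abstract points to.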