The SARS-CoV-2 S protein is a major point of interaction between the virus and the human immune system. As a consequence, the S protein is not a static target but undergoes rapid molecular evolution. To better understand the selection pressures acting during this evolution, we examined residue positions in the S protein that vary greatly across closely related viruses but are conserved in the subset of viruses that infect humans. These "evolutionarily important" residues were not distributed evenly across the S protein but were concentrated in two domains: the N-terminal domain and the receptor-binding domain, both of which play a role in host cell binding in a number of related viruses. Beyond this localization, evolutionary importance correlated positively with structural flexibility and inversely with distance from known or predicted host receptor-binding residues. Finally, we observed that the amino acid composition of these residues was biased toward more human-like, rather than virus-like, sequence motifs.
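The idea of a residue that varies across related viruses but is conserved among human-infecting ones can be illustrated with a small sketch. It is not the authors' scoring method: it assumes, purely for illustration, that variability is measured as per-column Shannon entropy in a gap-free multiple sequence alignment, and that the score is the entropy across all viruses minus the entropy within the human-infecting subset. The function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def importance_scores(all_seqs, human_seqs):
    """Per-position score: entropy over all sequences minus entropy over
    the human-infecting subset (an assumed, illustrative definition)."""
    length = len(all_seqs[0])
    scores = []
    for i in range(length):
        h_all = shannon_entropy([s[i] for s in all_seqs])
        h_human = shannon_entropy([s[i] for s in human_seqs])
        scores.append(h_all - h_human)
    return scores


# Toy, pre-aligned sequences (hypothetical data, not real S protein sequences).
all_seqs = ["ACDE", "ACDF", "AGDG", "ATDH"]
human_seqs = ["ACDE", "ACDF"]  # subset assumed to infect humans

scores = importance_scores(all_seqs, human_seqs)
top_position = max(range(len(scores)), key=scores.__getitem__)
print(scores, top_position)
```

In this toy alignment the second column scores highest because it varies across all four sequences yet is identical in the human-infecting pair. Fully conserved columns score zero.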
Motivation
The scoring of antibody-antigen docked poses starting from unbound homology models has not been systematically optimized for a large and diverse set of input sequences.
Results
To address this need, we have developed AbAdapt, a web server that accepts antibody and antigen sequences, models their 3D structures, predicts the epitope and paratope, and then docks the modeled structures using two established docking engines (Piper and Hex). Each of the key steps has been optimized by developing and training new machine-learning models. The sequences from a diverse set of 622 antibody-antigen pairs with known structures were used as inputs for leave-one-out cross validation. The final set of cluster representatives included at least one “Adequate” pose for 550/622 (88.4%) of the queries. The median (IQR) rank of these “Adequate” poses was 22 (5 to 77). Similar results were obtained on a holdout set of 100 unrelated antibody-antigen pairs. When epitopes were re-predicted using docking-derived features for specific antibodies, the median ROC AUC increased from 0.679 to 0.720 in cross validation and from 0.694 to 0.730 in the holdout set.
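The docking summary statistics reported above (the fraction of queries with at least one “Adequate” pose, and the median and IQR of the rank of that pose) can be computed along the following lines. This is a minimal sketch under stated assumptions, not AbAdapt's evaluation code: the input format (one best rank per query, or `None` when no “Adequate” pose was found), the function name, and the quartile convention are all assumptions, and the toy ranks are hypothetical.

```python
import statistics


def summarize_ranks(best_ranks):
    """Summarize docking results.

    best_ranks holds, for each query, the rank of its best "Adequate" pose,
    or None if no such pose was found (assumed input format).
    Returns (success_rate, median_rank, (q1, q3)).
    """
    hits = sorted(r for r in best_ranks if r is not None)
    success_rate = len(hits) / len(best_ranks)
    # Quartiles by the "inclusive" method; the paper's convention is not stated.
    q1, median, q3 = statistics.quantiles(hits, n=4, method="inclusive")
    return success_rate, median, (q1, q3)


# Toy data (hypothetical): five queries with a hit, one without.
best_ranks = [1, 3, 5, 7, 9, None]
rate, median_rank, iqr = summarize_ranks(best_ranks)
print(f"{rate:.1%} of queries; median rank {median_rank} (IQR {iqr[0]} to {iqr[1]})")
```

Applied to the reported counts, the success rate is simply 550/622, which rounds to 88.4%.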
Availability
AbAdapt is available at https://sysimm.org/abadapt/.
Supplementary information
Supplementary data are available at Bioinformatics Advances online.
Antibodies recognize their cognate antigens with high affinity and specificity, but predicting the binding site on the antigen (epitope) for a specific antibody remains a challenging problem. To address this problem, we developed AbAdapt, a pipeline that integrates antibody and antigen structural modeling with rigid docking in order to derive antibody-antigen specific features for epitope prediction. In this study, we systematically assessed the impact of integrating the state-of-the-art protein modeling method AlphaFold with the AbAdapt pipeline (AbAdapt-AF). By incorporating more accurate antibody models, we observed improvements in docking, paratope prediction, and prediction of antibody-specific epitopes. We further applied AbAdapt-AF to a benchmark of anti-receptor-binding domain (RBD) antibody complexes and found that it outperformed three alternative docking methods. AbAdapt-AF also achieved higher epitope prediction accuracy than the other epitope prediction tools tested on the same benchmark. We anticipate that AbAdapt-AF will facilitate prediction of antigen-antibody interactions in a wide range of applications.