Baseline Model for Predicting Protein–Ligand Unbinding Kinetics through Machine Learning

Amangeldiuly, Nurlybek; Karlov, Dmitry S.; Fedorov, Maxim V.

doi:10.1021/acs.jcim.0c00450

Cited by 23 publications

(34 citation statements)

References 61 publications


“…[77] A model obtained using random forest, ligand-protein interactions and information about the protein structure as descriptors achieved R²_te of 0.43 for a test set of 28 complexes with p38 MAP kinase.[78] The study presented here had a R²_tr higher than the method with the highest R²_tr value, Volsurf, and R²_te value lower than the method with the highest predictive power, the method based on SMD. On one hand, the SMD method includes more details about the ligand-protein interactions and the unbinding process.…”

Section: Discussion (mentioning)

confidence: 54%

Prediction of Drug-Target Binding Kinetics for Flexible Proteins by Comparative Binding Energy Analysis

Nunes-Alves, Ormersbach, Wade. 2021. Preprint.


There is growing consensus that the optimization of the kinetic parameters for drug-protein binding leads to improved drug efficacy. Therefore, computational methods have been developed to predict kinetic rates and to derive quantitative structure-kinetic relationships (QSKRs). Many of these methods are based on crystal structures of ligand-protein complexes. However, a drawback is that each protein-ligand complex is usually treated as having a single structure. Here, we present a modification of COMparative BINding Energy (COMBINE) analysis, which uses the structures of protein-ligand complexes to predict binding parameters. We introduce the option to use multiple structures to describe each ligand-protein complex into COMBINE analysis and apply this to study the effects of protein flexibility on the derivation of dissociation rate constants (k_off) for inhibitors of p38 mitogen-activated protein (MAP) kinase, which has a flexible binding site. Multiple structures were obtained for each ligand-protein complex by performing docking to an ensemble of protein configurations obtained from molecular dynamics simulations. Coefficients to scale ligand-protein interaction energies determined from energy-minimized structures of ligand-protein complexes were obtained by partial least squares regression and allowed the computation of k_off values. The QSKR model obtained using single, energy minimized crystal structures for each ligand-protein complex had a higher predictive power than the QSKR model obtained with multiple structures from ensemble docking. However, the incorporation of protein-ligand flexibility helped to highlight additional ligand-protein interactions that lead to longer residence times, like interactions with residues Arg67 and Asp168, which are close to the ligand in many crystal structures. These results show that COMBINE analysis is a promising method to guide the design of compounds that bind to flexible proteins with improved binding kinetics.


“…[75] A model obtained using random forest, ligand-protein interactions and information about the protein structure as descriptors achieved R²_te of 0.43 for a test set of 28 complexes with p38 MAP kinase.[76] The study presented here had a R²_tr higher than the method with the highest R²_tr value, Volsurf, and R²_te value lower than the method with the highest predictive power, the method based on SMD. On one hand, the SMD method includes more details about the ligand-protein interactions and the unbinding process.…”

Section: Discussion (mentioning)

confidence: 54%

Prediction of Drug-Target Binding Kinetics for Flexible Proteins by Comparative Binding Energy Analysis

Nunes-Alves, Ormersbach, Wade. 2021. Preprint.


There is growing consensus that the optimization of the kinetic parameters for drug-protein binding leads to improved drug efficacy. Therefore, computational methods have been developed to predict kinetic rates and to derive quantitative structure-kinetic relationships (QSKRs). Many of these methods are based on crystal structures of ligand-protein complexes. However, a drawback is that each protein-ligand complex is usually treated as having a single structure. Here, we present a modification of COMparative BINding Energy (COMBINE) analysis, which uses the structures of protein-ligand complexes to predict binding parameters. We introduce the option to use multiple structures to describe each ligand-protein complex into COMBINE analysis and apply this to study the effects of protein flexibility on the derivation of dissociation rate constants (k_off) for inhibitors of p38 mitogen-activated protein (MAP) kinase, which has a flexible binding site. Multiple structures were obtained for each ligand-protein complex by performing docking to an ensemble of protein configurations obtained from molecular dynamics simulations. Coefficients to scale ligand-protein interaction energies determined from energy-minimized structures of ligand-protein complexes were obtained by partial least squares regression and allowed the computation of k_off values. The QSKR model obtained using single, energy minimized crystal structures for each ligand-protein complex had a higher predictive power than the QSKR model obtained with multiple structures from ensemble docking. However, the incorporation of protein-ligand flexibility helped to highlight additional ligand-protein interactions that lead to longer residence times, like interactions with residues Arg67 and Asp168, which are close to the ligand in many crystal structures, showing that COMBINE analysis is a promising method to design leads with improved kinetic rates for flexible proteins.


“…When we were preparing this article, Fedorov et al reported a similar collection of kinetic data.[21] Their data set consisted of 501 protein–ligand complexes with experimentally measured dissociation rate constants. A comprehensive comparison of the two data sets was carried out from the aspects of the k_off data distribution, protein types, ligand structural diversity, and complex structures.…”

Section: Results (mentioning)

confidence: 99%

“…Fedorov’s work demonstrated what an RF model could achieve on their data set.[21] Because our data set is the same in nature as theirs, we also trained a similar RF model[34] for computing the k_off value of a given protein–ligand complex on our data set. To achieve this goal, the atom pair descriptors implemented in the RF-Score scoring function[35] were adopted here to construct the RF model.…”

Section: Methods (mentioning)

confidence: 99%

“…Recently, Fedorov et al have described a set of 501 protein–ligand complexes with dissociation rate constants collected from the public literature.[21] This data set was the largest data set of this kind to the best of our knowledge. In addition, Fedorov et al utilized their data set to develop a random forest (RF) model for predicting dissociation rate constants.…”

Section: Introduction (mentioning)

confidence: 99%


Public Data Set of Protein–Ligand Dissociation Kinetic Constants for Quantitative Structure–Kinetics Relationship Studies

Liu, Lin, et al. 2022. ACS Omega.


Protein–ligand binding affinity reflects the equilibrium thermodynamics of the protein–ligand binding process. Binding/unbinding kinetics is the other side of the coin. Computational models for interpreting the quantitative structure–kinetics relationship (QSKR) aim at predicting protein–ligand binding/unbinding kinetics based on protein structure, ligand structure, or their complex structure, which in principle can provide a more rational basis for structure-based drug design. Thus far, most of the public data sets used for deriving such QSKR models are rather limited in sample size and structural diversity. To tackle this problem, we have compiled a set of 680 protein–ligand complexes with experimental dissociation rate constants (k_off), which were mainly curated from the references accumulated for updating our PDBbind database. Three-dimensional structure of each protein–ligand complex in this data set was either retrieved from the Protein Data Bank or carefully modeled based on a proper template. The entire data set covers 155 types of protein, with their dissociation kinetic constants (k_off) spanning nearly 10 orders of magnitude. To the best of our knowledge, this data set is the largest of its kind reported publicly. Utilizing this data set, we derived a random forest (RF) model based on protein–ligand atom pair descriptors for predicting k_off values. We also demonstrated that utilizing modeled structures as additional training samples will benefit the model performance. The RF model with mixed structures can serve as a baseline for testifying other more sophisticated QSKR models. The whole data set, namely, PDBbind-koff-2020, is available for free download at our PDBbind-CN web site ( ).
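
The protein–ligand atom pair descriptors mentioned in this abstract can be illustrated with a minimal sketch. It assumes RF-Score-style features, that is, counts of protein–ligand element pairs lying within a distance cutoff. The element lists, the 12 Å cutoff, the function names, and the toy coordinates are all assumptions made for illustration and are not taken from the cited paper. The resulting vector is the kind of input a random forest regressor could be trained on to predict k_off; the regression step is omitted here.

```python
import math
from itertools import product

# Assumed element vocabularies and cutoff (illustrative choices, not from the source).
PROTEIN_ELEMENTS = ("C", "N", "O", "S")
LIGAND_ELEMENTS = ("C", "N", "O", "F", "P", "S", "Cl", "Br", "I")
CUTOFF = 12.0  # angstroms, assumed


def atom_pair_counts(protein_atoms, ligand_atoms, cutoff=CUTOFF):
    """Count protein-ligand element pairs closer than `cutoff`.

    Each atom is a tuple (element, (x, y, z)). Returns a dict keyed by
    (protein_element, ligand_element). Pairs involving elements outside
    the assumed vocabularies are ignored.
    """
    counts = {pair: 0 for pair in product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS)}
    for p_el, p_xyz in protein_atoms:
        for l_el, l_xyz in ligand_atoms:
            if (p_el, l_el) in counts and math.dist(p_xyz, l_xyz) < cutoff:
                counts[(p_el, l_el)] += 1
    return counts


def feature_vector(counts):
    """Flatten the pair counts into a fixed-order list of features."""
    return [counts[pair] for pair in product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS)]


# Toy complex (hypothetical coordinates): three protein atoms, two ligand atoms.
protein = [("C", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0)), ("O", (20.0, 0.0, 0.0))]
ligand = [("C", (1.0, 0.0, 0.0)), ("O", (4.0, 0.0, 0.0))]

counts = atom_pair_counts(protein, ligand)
features = feature_vector(counts)
```

In this toy complex the distant protein oxygen contributes no contacts, so only the pairs formed by the first two protein atoms are counted. One such vector per complex would form a row of the training matrix.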
