Cross‐docking benchmark for automated pose and ranking prediction of ligand binding

Wierbowski, Shayne D.; Wingert, Bentley M.; Zheng, Jim; Camacho, Carlos J.

doi:10.1002/pro.3784

Cited by 57 publications

(61 citation statements)

References 26 publications


“…Note that the purpose of our CrossDocked2020 set is orthogonal to cross-docking benchmark datasets [51] as the goal is not to evaluate docking algorithms, but to provide a standard set of already generated poses for training, evaluating, and comparing machine learning models.…”

Section: Discussion (mentioning)

confidence: 99%

Three-Dimensional Convolutional Neural Networks and a Cross-Docked Data Set for Structure-Based Drug Design

Francoeur, Masuda, Sunseri, et al. 2020, J. Chem. Inf. Model.



One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard dataset of sufficient size to compare performance between models. We present a new dataset for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank and perform a comprehensive evaluation of grid-based convolutional neural network models on this dataset. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind dataset, how performance improves by adding more, lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best performing model, an ensemble of 5 densely connected convolutional networks, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized dataset for training machine learning models to recognize ligands in non-cognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this dataset for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.
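The abstract above reports a root mean squared error and a Pearson R for affinity prediction. As a minimal sketch of how these two metrics are defined, the following computes both from paired lists of predicted and experimental affinities. The function names `rmse` and `pearson_r` and the sample values are illustrative assumptions; this is not the authors' evaluation code.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical affinity values, for illustration only.
predicted_affinity = [5.1, 6.3, 7.0, 4.2]
experimental_affinity = [5.0, 6.8, 6.5, 4.9]
print(rmse(predicted_affinity, experimental_affinity))
print(pearson_r(predicted_affinity, experimental_affinity))
```

The two metrics are complementary: RMSE measures absolute error in the affinity units, while Pearson R measures only how well the predictions track the experimental values linearly.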


“…Molecular modelling performed on estrogen-related receptors is not an exception. As the studies are performed mostly to search for new agonists or antagonists, the ligands re-docked into the ERs or SHBG are mainly E2 [52, 53, 54, 55, 56, 57] and 4-hydroxytamoxifen [52, 58, 59]. RMSD value in most of the studies varies from 0.26 to 1.4 Å, which proves the correctness of the docking methods in finding the proper orientation of the ligand in the active site.…”

Section: Application Of Molecular Modelling Methods In the Study O… (mentioning)

confidence: 99%

“…Moreover, in terms of the ERs, RMSF and RMSD values suggest whether the analyzed molecule is the receptor’s agonist or antagonist [50, 52, 53, 54, 59]. As already mentioned in Section 1.2, the positioning of the H12 helix is differently influenced by agonists and antagonists.…”

Section: Application Of Molecular Modelling Methods In the Study O… (mentioning)

confidence: 99%

Application of Various Molecular Modelling Methods in the Study of Estrogens and Xenoestrogens

Mazurek, Szeleszczuk, Simonson, et al. 2020, IJMS


In this review, applications of various molecular modelling methods in the study of estrogens and xenoestrogens are summarized. Selected biomolecules that are the most commonly chosen as molecular modelling objects in this field are presented. In most of the reviewed works, ligand docking using solely force field methods was performed, employing various molecular targets involved in metabolism and action of estrogens. Other molecular modelling methods such as molecular dynamics and combined quantum mechanics with molecular mechanics have also been successfully used to predict the properties of estrogens and xenoestrogens. Among published works, a great number also focused on the application of different types of quantitative structure–activity relationship (QSAR) analyses to examine estrogen’s structures and activities. Although the interactions between estrogens and xenoestrogens with various proteins are the most commonly studied, other aspects such as penetration of estrogens through lipid bilayers or their ability to adsorb on different materials are also explored using theoretical calculations. Apart from molecular mechanics and statistical methods, quantum mechanics calculations are also employed in the studies of estrogens and xenoestrogens. Their applications include computation of spectroscopic properties, both vibrational and Nuclear Magnetic Resonance (NMR), and also in quantum molecular dynamics simulations and crystal structure prediction. The main aim of this review is to present the great potential and versatility of various molecular modelling methods in the studies on estrogens and xenoestrogens.
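The citation statements above quote re-docking RMSD values between a docked ligand pose and its crystallographic reference. A minimal sketch of that measure follows, assuming both poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied. The function name `pose_rmsd` and the coordinates are hypothetical, and symmetry-equivalent atoms are not handled.

```python
import math


def pose_rmsd(reference, docked):
    """In-place RMSD (same units as the input, e.g. Å) between two poses.

    Each pose is a sequence of (x, y, z) tuples with matching atom order.
    """
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = 0.0
    for ref_atom, dock_atom in zip(reference, docked):
        # Sum squared displacement over the three Cartesian components.
        squared += sum((r - d) ** 2 for r, d in zip(ref_atom, dock_atom))
    return math.sqrt(squared / len(reference))


# Hypothetical three-atom ligand; the docked pose is shifted by 0.5 Å along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
docked_pose = [(x + 0.5, y, z) for x, y, z in crystal_pose]
print(pose_rmsd(crystal_pose, docked_pose))
```

Skipping the superposition step is deliberate for docking comparisons: aligning the ligands first would hide a pose that has the right shape but sits in the wrong place in the binding site.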


“…At the time of this publication, the Protein Data Bank [9] (PDB) contained more than 60 crystal structures of hDHODH determined in the holo form, complexed with substrates, or inhibitors. The current structural data sufficiently allows researchers to conduct VS experiments in order to find new selective inhibitors for hDHODH [10][11][12]. However, based on the binding pose prediction success rates in the recent benchmark study, hDHODH has been characterized as a "very hard" and challenging molecular target for docking [13]. Furthermore, a number of open-source academic and commercial molecular docking software packages show quite varied performances depending on protein family and also on crystal structure of the protein [13][14][15][16][17][18][19][20].…”

Section: Introduction (mentioning)

confidence: 99%

Improving Reliability of the Virtual Screening Process for Human Dihydroorotate Dehydrogenase Enzyme by Combination of the Ensemble and Consensus Docking Approaches

Chilingaryan, Abelyan, Sargsyan, et al. 2021, Preprint


Inconsistencies in the performance of the virtual screening (VS) process, which depend on the software used and the structural conformation of the protein, are a challenging issue in drug design and discovery. Varying performance, especially in early recognition of potential hit compounds, negatively affects the whole process and wastes time and resources. Appropriate application of ensemble docking and consensus-scoring approaches can significantly increase the reliability of VS results. Dihydroorotate dehydrogenase (DHODH) is a key enzyme in the pyrimidine biosynthesis pathway and is considered a valuable therapeutic target in cancer, autoimmune, and viral diseases. Based on a benchmark study and an analysis of different combinations of the applied methods and approaches, we suggest a structure-based virtual screening (SBVS) workflow that can be used to increase the reliability of VS.
