2021
DOI: 10.1093/bib/bbab225
Machine-learning scoring functions trained on complexes dissimilar to the test set already outperform classical counterparts on a blind benchmark

Abstract: The superior performance of machine-learning scoring functions for docking has caused a series of debates on whether it is due to learning knowledge from training data that are similar in some sense to the test data. With a systematically revised methodology and a blind benchmark realistically mimicking the process of prospective prediction of binding affinity, we have evaluated three broadly used classical scoring functions and five machine-learning counterparts calibrated with both random forest and extreme …

Cited by 9 publications (8 citation statements)

References 23 publications
“…The BindingNet data set can also be used to develop and benchmark ML-based models for binding affinity and binding pose prediction, and molecular generation. PDBbind is the most widely used training data set for ML methods, and the binding affinity prediction performances of ML-based scoring functions on PDBbind have been reported with R_p around 0.8. However, the insufficient size and sparsity of PDBbind have resulted in poor generalization capability on out-of-distribution data sets. Both Yang et al. and Mastropietro et al. have found that ligand memorization often dominates the predictions of ML-based models. In addition, Zhu et al. performed an interpretable analysis of PDBbind-trained models in 2022 and revealed that these models rely on buried-SASA-related features to make predictions. Since the BindingNet data set has significantly increased the number of available complex structures, it would be interesting to explore the performance of ML models trained on BindingNet.…”
Section: Results
confidence: 99%
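The R_p cited in the excerpt above is the Pearson correlation between predicted and experimental binding affinities on a test set. A minimal sketch of how this evaluation metric is computed (the pKd values below are made up for illustration):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient R_p between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical experimental vs. predicted pKd values for five complexes
experimental = [4.2, 5.1, 6.8, 7.3, 8.0]
predicted = [4.5, 5.0, 6.2, 7.8, 7.6]
print(round(pearson_r(experimental, predicted), 3))
```

Note that a high R_p on a random split of PDBbind does not by itself demonstrate generalization, which is precisely the point the cited studies debate.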
“…Towards the acceleration of traditional CADD and AIDD [88,89], this article puts forward the concept of a GIBAC and argues that the time is now ripe to build an accurate and efficient GIBAC for the prediction of novel interactions with desired affinities [24,47,90,91] on the genome scale, for better characterization of signaling networks, and for the design of novel binding partners, either small molecules or therapeutic proteins, for various disease-related targets [43,92–95].…”
Section: Conclusion and Discussion
confidence: 99%
“…While these considerations are very important in the development of new methods and when comparing different models, in practical applications the similarity between the training set and the system under investigation can be exploited to obtain superior predictions compared with classical SFs. For example, Li et al. (2021a) argue that the performance of ML scoring functions is underestimated due to the artificial removal of similarities between the training and test sets, and put forward a new benchmark which tries to mimic prospective binding affinity predictions. However, it is important to keep in mind that ML and DL SFs might be less effective when dealing with novel targets or small molecules (Su et al., 2020), and the applicability domain needs to be clearly defined.…”
Section: Cross-validation and Data Splitting
confidence: 99%
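The "removal of similarities between the training and test sets" discussed above is typically implemented by discarding any training complex whose similarity to some test complex exceeds a cutoff. A minimal sketch, assuming a precomputed pairwise similarity table (e.g., Tanimoto scores on ligand fingerprints; all identifiers, scores, and the function name below are hypothetical):

```python
def filter_training_set(train_ids, test_ids, similarity, threshold=0.9):
    """Keep only training complexes whose maximum similarity to ANY
    test complex falls below `threshold`.

    `similarity` maps (train_id, test_id) pairs to a score in [0, 1].
    """
    kept = []
    for t in train_ids:
        max_sim = max(similarity[(t, q)] for q in test_ids)
        if max_sim < threshold:
            kept.append(t)
    return kept

# Toy example with made-up similarity scores
train = ["1abc", "2def", "3ghi"]
test = ["9xyz"]
sim = {("1abc", "9xyz"): 0.95, ("2def", "9xyz"): 0.40, ("3ghi", "9xyz"): 0.10}
print(filter_training_set(train, test, sim))  # only dissimilar complexes remain
```

The debated question is whether such filtering measures true generalization or artificially handicaps ML scoring functions relative to how they would be deployed prospectively.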