Molecular docking can be used to predict how strongly small-molecule binders and their chemical derivatives bind to a macromolecular target using its available three-dimensional structures. Scoring functions (SFs) are employed to rank these molecules by their predicted binding affinity (potency). A classical SF assumes a predetermined, theory-inspired functional form for the relationship between the features characterizing the structure of the protein–ligand complex and its predicted binding affinity (this relationship is almost always assumed to be linear). Recent years have seen the rise of machine-learning SFs, which are fast regression models built instead with contemporary supervised learning algorithms. In this review, we analyze machine-learning SFs for drug lead optimization in the 2015–2019 period. The performance gap between classical and machine-learning SFs was already large and has broadened further owing to methodological improvements and the availability of more training data. Against the expectations of many experts, SFs employing deep learning techniques were not always more predictive than those based on more established machine-learning techniques and, when they were, the performance gain was small. More codes and webservers are now available and ready to be applied to prospective structure-based drug lead optimization studies; these have exhibited excellent predictive accuracy in compelling retrospective tests, in some cases outperforming much more computationally demanding molecular simulation-based methods. A discussion of future work completes this review. This article is categorized under: Computer and Information Science > Chemoinformatics
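The contrast drawn above, a classical SF with a predetermined linear functional form versus a machine-learning SF that infers the form from data, can be illustrated with a minimal sketch. The features and affinities below are random placeholders standing in for real docking descriptors, not data from any actual SF:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Placeholder descriptors for 200 protein-ligand complexes (e.g. contact counts)
X = rng.random((200, 6))
# Synthetic "binding affinity" with a deliberately non-linear feature dependence
y = X[:, 0] * X[:, 1] + np.sin(3 * X[:, 2]) + 0.1 * rng.standard_normal(200)

# Classical-style SF: the functional form is fixed in advance (linear)
linear_sf = LinearRegression().fit(X[:150], y[:150])
# Machine-learning SF: the functional form is learned from the training data
rf_sf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[:150], y[:150])

print("linear R^2 on held-out complexes:", linear_sf.score(X[150:], y[150:]))
print("random forest R^2 on held-out complexes:", rf_sf.score(X[150:], y[150:]))
```

When the true feature-affinity relationship is non-linear, as in this toy example, the fixed linear form cannot capture it however well it is fitted, which is the motivation the abstract gives for moving to supervised learning models.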
Molecular docking predicts whether and how small molecules bind to a macromolecular target using a suitable 3D structure. Scoring functions for structure-based virtual screening primarily aim at discovering which molecules bind to the considered target when they form part of a library with a much higher proportion of non-binders. Classical scoring functions are essentially models that build a linear mapping between the features describing a protein–ligand complex and its binding label. Machine learning, a major subfield of artificial intelligence, can also be used to build fast supervised learning models for this task. In this review, we analyze such machine-learning scoring functions for structure-based virtual screening in the 2015–2019 period. We discuss what the shortcomings of current benchmarks really mean and what valid alternatives have been employed. Retrospective studies based on these alternatives found that machine-learning scoring functions achieved substantially higher hit rates and potencies than the classical scoring functions they were compared to. Several of these machine-learning scoring functions were also employed in prospective studies, in which mid-nanomolar binders with novel chemical structures were discovered directly, without any potency optimization. We therefore highlight the codes and webservers that are available to build or apply machine-learning scoring functions in prospective structure-based virtual screening studies. A discussion of prospects for future work completes this review.
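The screening task described above, ranking a compound library so that the scarce binders concentrate at the top, can be sketched as follows. The descriptors, binding labels, and the roughly 8% binder rate are synthetic illustrations, not real screening data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 1000
X = rng.random((n, 5))                            # placeholder complex descriptors
labels = (X[:, 0] + X[:, 1] > 1.6).astype(int)    # synthetic binder labels (~8% binders)

X_train, y_train = X[:600], labels[:600]
X_lib, y_lib = X[600:], labels[600:]              # the "library" to be screened

clf = RandomForestClassifier(n_estimators=200, random_state=2).fit(X_train, y_train)
scores = clf.predict_proba(X_lib)[:, 1]           # predicted probability of binding
top50 = np.argsort(scores)[::-1][:50]             # top-ranked 50 library compounds
hit_rate_top = y_lib[top50].mean()
hit_rate_random = y_lib.mean()
print(f"hit rate in top 50: {hit_rate_top:.2f} vs library base rate: {hit_rate_random:.2f}")
```

The hit rate among the top-ranked compounds versus the library's base rate is the kind of enrichment comparison the retrospective studies in this abstract report, here on toy data only.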
The superior docking performance of machine-learning scoring functions has sparked debate over whether it merely stems from learning from training data that are similar, in some sense, to the test data. Using a systematically revised methodology and a blind benchmark that realistically mimics prospective binding-affinity prediction, we evaluated three broadly used classical scoring functions and five machine-learning counterparts calibrated with both random forest and extreme gradient boosting, using both solo and hybrid features. We show for the first time that machine-learning scoring functions trained exclusively on the complexes dissimilar to the test set, a proportion as low as 8% of the data, already outperform classical scoring functions; this percentage is far lower than what has recently been reported on all three CASF benchmarks. The performance of machine-learning scoring functions is underestimated when similar samples are absent from artificially created training sets, which discard the full spectrum of complexes to be found in a prospective setting. Given that some degree of similarity is inevitable in any large dataset, the criterion for selecting a scoring function should be which one makes the best use of all available data. Software code and data are provided at https://github.com/cusdulab/MLSF so that interested readers can rapidly rebuild the scoring functions, reproduce our results, and even extend the analyses to their own benchmarks.
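The dissimilarity-controlled training described above can be sketched as follows. Here Euclidean distance on synthetic descriptors stands in for the protein/ligand similarity measures used in such studies, and keeping the 20% most dissimilar complexes is purely illustrative (not the study's 8% figure or its actual protocol):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics.pairwise import euclidean_distances

rng = np.random.default_rng(1)
X = rng.random((300, 8))                            # placeholder complex descriptors
y = X @ rng.random(8) + 0.5 * np.sin(4 * X[:, 0])   # synthetic binding affinities

X_test, y_test = X[250:], y[250:]
X_pool, y_pool = X[:250], y[:250]

# Rank pool complexes by their distance to the nearest test complex and keep
# only the most dissimilar 20% for training (illustrative cutoff)
min_dist = euclidean_distances(X_pool, X_test).min(axis=1)
keep = np.argsort(min_dist)[::-1][: int(0.2 * len(X_pool))]
X_train, y_train = X_pool[keep], y_pool[keep]

# Random forest and gradient boosting stand in for the two model families
# evaluated in the study (the latter as a sklearn proxy for extreme gradient boosting)
for model in (RandomForestRegressor(n_estimators=200, random_state=1),
              GradientBoostingRegressor(random_state=1)):
    model.fit(X_train, y_train)
    print(type(model).__name__, "R^2 on test set:", round(model.score(X_test, y_test), 3))
```

Varying the cutoff in such a sketch is one way to probe how performance depends on train-test similarity, which is the methodological question the abstract addresses.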