Molecular docking has been regarded as a routine tool for drug discovery, but its accuracy depends heavily on the reliability of scoring functions (SFs). With the rapid development of machine learning (ML) techniques, ML-based SFs have gradually emerged as a promising alternative for protein-ligand binding affinity prediction and virtual screening, and most of them have shown significantly better performance than a wide range of classical SFs. The emergence of more data-hungry deep learning (DL) approaches in recent years has further facilitated the development of more accurate SFs. Here, we summarize the progress of traditional ML-based SFs in the last few years and provide insights into recently developed DL-based SFs. We believe that the continuous improvement of ML-based SFs can guide early-stage drug design and accelerate the discovery of new drugs.
This article is categorized under: Computer and Information Science > Chemoinformatics
Keywords: deep learning, machine learning, molecular docking, scoring function, structure-based drug design
1 | INTRODUCTION
Traditional drug discovery relies largely on high-throughput screening, an experimental technique with acceptable performance but high cost and low efficiency [1]. With the rapid development of computational chemistry and computer technology, computer-aided drug design (CADD) has emerged over the past three decades as a powerful technique for the design and development of new drug candidates [2]. Virtual screening (VS), an important branch of CADD, enriches potential actives from large virtual compound libraries through in silico methods rather than physical experiments, which not only accelerates drug discovery but also greatly reduces its time and resource costs [3-5]. Depending on whether the three-dimensional (3D) structure of the target is used, VS approaches can be classified into two major categories: ligand-based virtual screening (LBVS) and structure-based virtual screening (SBVS) [6]. LBVS aims to discover active molecules through models built from a set of known ligands of the target of interest, which may limit its ability to find novel chemotypes. Compared with LBVS, SBVS is considered a better choice for discovering novel active compounds when the 3D structure of the target is available [7].
Chao Shen and Junjie Ding are equivalent first authors.