With the recent explosion of chemical libraries beyond a billion molecules, more efficient virtual screening approaches are needed. The Deep Docking (DD) platform enables up to 100-fold acceleration of structure-based virtual screening by docking only a subset of a chemical library, iteratively synchronized with a ligand-based prediction of the remaining docking scores. This method results in hundreds- to thousands-fold virtual hit enrichment (without significant loss of potential drug candidates) and hence enables the screening of billion-molecule chemical libraries without extraordinary computational resources. Herein, we present and discuss the generalized DD protocol, which has proven successful in various computer-aided drug discovery (CADD) campaigns and can be applied in conjunction with any conventional docking program. The protocol encompasses eight consecutive stages: molecular library preparation, receptor preparation, random sampling of a library, ligand preparation, molecular docking, model training, model inference and residual docking. The standard DD workflow applies stages 3-7 iteratively, continuously augmenting the training set, and the number of iterations can be adjusted by the user. A predefined recall value controls the percentage of top-scoring molecules retained by DD and can be adjusted to control the library size reduction. The procedure takes 1-2 weeks (depending on the available resources) and can be completely automated on computing clusters managed by job schedulers. This open-source protocol, available at https://github.com/jamesgleave/DD_protocol, can be readily deployed by CADD researchers and can significantly accelerate the effective exploration of ultra-large portions of chemical space.
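The role of the recall value can be illustrated with a minimal sketch. It assumes a docked validation set in which each molecule carries a model-predicted probability of being a top-scoring virtual hit and a label derived from its actual docking score. The function names `threshold_for_recall` and `retained_fraction` and the toy data are hypothetical and are not taken from the DD_protocol repository; the sketch only shows how a recall target could translate into a probability cutoff and a corresponding library size reduction.

```python
import math


def threshold_for_recall(probabilities, is_hit, recall):
    """Return the highest probability cutoff that retains at least
    `recall` (a fraction in (0, 1]) of the labeled hits."""
    hit_probs = sorted(
        (p for p, hit in zip(probabilities, is_hit) if hit), reverse=True
    )
    if not hit_probs:
        raise ValueError("validation set contains no hits")
    # Number of hits that must survive the cutoff; the small epsilon
    # guards against floating-point overshoot in recall * n.
    n_keep = max(1, math.ceil(recall * len(hit_probs) - 1e-9))
    return hit_probs[n_keep - 1]


def retained_fraction(probabilities, cutoff):
    """Fraction of molecules whose predicted probability passes the cutoff."""
    kept = sum(1 for p in probabilities if p >= cutoff)
    return kept / len(probabilities)


# Toy validation set (illustrative values only).
val_probs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
val_hits = [True, True, False, True, False, True, False, False, False, False]

cutoff = threshold_for_recall(val_probs, val_hits, recall=0.75)
print(cutoff)                                # 0.7
print(retained_fraction(val_probs, cutoff))  # 0.4
```

In this toy case, keeping 75% of the hits retains 40% of the molecules, whereas demanding a recall of 1.0 lowers the cutoff to 0.4 and retains 60%. This is the trade-off the recall parameter exposes: a higher recall loses fewer potential hits but shrinks the library less.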
Deep learning-accelerated docking, coupled with computational hit selection strategies, enables the identification of inhibitors of the SARS-CoV-2 main protease from a chemical library of 40 billion small molecules.
Summary
Deep learning (DL) can significantly accelerate virtual screening of ultra-large chemical libraries, enabling the evaluation of billions of compounds at a fraction of the computational cost and time required by conventional docking. Here we introduce DD-GUI, a graphical user interface for Deep Docking (DD), the DL approach we previously developed. DD-GUI allows large-scale virtual screens to be set up quickly and intuitively, and provides convenient tools to track the progress and analyze the outcomes of a drug discovery project.
Availability and Implementation
DD-GUI is freely available under an MIT license on GitHub at https://github.com/jamesgleave/DeepDockingGUI
Supplementary information
Supplementary data are available at Bioinformatics online.