A main challenge in the enumeration of small-molecule chemical spaces for drug design is to quickly and accurately differentiate between possible and impossible molecules. Current approaches for screening enumerated molecules (e.g., 2D heuristics and 3D force fields) have not been able to achieve a balance between accuracy and speed. We have developed a new automated approach for fast and high-quality screening of small molecules, with the following steps: (1) for each molecule in the set, an ensemble of 2D descriptors as feature encoding is computed; (2) on a random small subset, classification (feasible/infeasible) targets via a 3D-based approach are generated; (3) a classification dataset with the computed features and targets is formed and a machine learning model for predicting the 3D approach's decisions is trained; and (4) the trained model is used to screen the remainder of the enumerated set. Our approach is ≈8× (7.96× to 8.84×) faster than screening via 3D simulations without significantly sacrificing accuracy; while compared to 2D-based pruning rules, this approach is more accurate, with better coverage of known feasible molecules. Once the topological features and 3D conformer evaluation methods are established, the process can be fully automated, without any additional chemistry expertise.
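The four steps can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in, since the abstract does not specify the descriptors, the 3D method, or the model: `toy_2d_features` replaces the real 2D descriptor ensemble, `toy_3d_oracle` replaces the slow 3D-based feasibility check, and `NearestCentroid` is a deliberately simple classifier chosen only so the example has no external dependencies.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Features = Tuple[float, ...]


def toy_2d_features(mol: str) -> Features:
    """Hypothetical 2D encoding of a molecule string (NOT the paper's descriptors)."""
    n_digits = sum(ch.isdigit() for ch in mol)
    return (float(len(mol)), float(mol.count("=")), float(n_digits))


def toy_3d_oracle(mol: str) -> bool:
    """Hypothetical stand-in for the expensive 3D-based feasible/infeasible decision."""
    return sum(ch.isdigit() for ch in mol) <= 2


class NearestCentroid:
    """Tiny classifier: predict the label whose mean feature vector is closest."""

    def fit(self, X: Sequence[Features], y: Sequence[bool]) -> "NearestCentroid":
        self.centroids: Dict[bool, Features] = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
        return self

    def predict(self, x: Features) -> bool:
        def dist2(c: Features) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))


def screen(
    molecules: Sequence[str],
    featurize: Callable[[str], Features],
    oracle: Callable[[str], bool],
    label_fraction: float = 0.2,
    seed: int = 0,
) -> Dict[str, bool]:
    """Return a feasible/infeasible decision for every (unique) molecule."""
    rng = random.Random(seed)

    # Step 1: compute the 2D feature encoding for every molecule.
    features = {m: featurize(m) for m in molecules}

    # Step 2: label a small random subset with the expensive 3D-based check.
    n_labeled = min(len(molecules), max(2, int(len(molecules) * label_fraction)))
    labeled: List[str] = rng.sample(list(molecules), n_labeled)
    targets = [oracle(m) for m in labeled]

    # Step 3: train a model to predict the 3D approach's decisions.
    model = NearestCentroid().fit([features[m] for m in labeled], targets)

    # Step 4: screen the remainder with the cheap trained model.
    decisions = dict(zip(labeled, targets))
    for m in molecules:
        if m not in decisions:
            decisions[m] = model.predict(features[m])
    return decisions


if __name__ == "__main__":
    enumerated = ["CCO", "C=C", "C1CC1", "C1CC2CC12", "C1CC2CC12C3CC3", "CC=CC", "CCCC", "C1CCCCC1"]
    for mol, feasible in screen(enumerated, toy_2d_features, toy_3d_oracle, label_fraction=0.5).items():
        print(mol, "feasible" if feasible else "infeasible")
```

In this arrangement the expensive oracle is called only on the labeled subset, which is where a speedup over screening every molecule with the 3D method would come from; the size of that subset (`label_fraction`, an assumed parameter) trades labeling cost against model accuracy.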