The discovery of new molecules and materials helps expand the horizons of novel and innovative real-life applications. In the pursuit of finding molecules with desired properties, chemists have traditionally relied...
Spectroscopy is the study of how matter interacts with electromagnetic radiation. The spectra of any molecule are highly information-rich, yet the inverse relation of spectra to the corresponding molecular structure is still an unsolved problem. Nuclear magnetic resonance (NMR) spectroscopy is one such critical technique in the scientist's toolkit for characterizing molecules. In this work, a novel machine learning framework is proposed that attempts to solve this inverse problem by navigating the chemical space to find the correct structure given an NMR spectrum. The proposed framework uses a combination of online Monte Carlo tree search (MCTS) and a set of graph convolution networks to build a molecule iteratively. Our method can predict the structure of the molecule ∼80% of the time in its top 3 guesses for molecules with <10 heavy atoms. We believe that the proposed framework is a significant step in solving the inverse design problem of NMR spectra.
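The "top 3 guesses" figure quoted above is a top-k accuracy. As a minimal sketch of how such a metric is typically computed (not the authors' evaluation code), the function below counts a prediction as correct when the target structure appears among the first k ranked candidates. The function name, the use of SMILES strings, and the example data are all hypothetical, and the comparison assumes every string has already been canonicalized.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_guesses: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of targets that appear among the first k ranked guesses.

    Assumes structures are given as canonical strings (for example canonical
    SMILES), so that plain string equality means "same molecule".
    """
    if len(ranked_guesses) != len(targets):
        raise ValueError("ranked_guesses and targets must have the same length")
    if not targets:
        raise ValueError("at least one target is required")
    hits = sum(
        1
        for guesses, target in zip(ranked_guesses, targets)
        if target in guesses[:k]
    )
    return hits / len(targets)


# Hypothetical ranked candidates (best first) for four target molecules.
guesses = [
    ["CCO", "COC", "CC=O"],
    ["CC(=O)O", "OCC=O", "COC=O"],
    ["C1CC1", "CC=C", "CC#C"],
    ["CCN", "CNC", "CC#N"],
]
targets = ["COC", "COC=O", "CCC", "CCN"]

print(top_k_accuracy(guesses, targets, k=3))  # 3 of 4 targets are in the top 3
print(top_k_accuracy(guesses, targets, k=1))  # only the last target is ranked first
```

In practice the string comparison would be replaced by a proper structure match, since two different strings can encode the same molecule.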
Drug design involves the process of identifying and designing novel molecules that have desirable properties and bind well to a given target receptor. Typically, such molecules are identified by screening large chemical libraries for desirable physicochemical properties and binding strength with the target protein. This traditional approach, however, has severe limitations, as exhaustively screening every molecule in known chemical libraries is computationally infeasible. Furthermore, currently available molecular libraries are only a minuscule part of the entire set of possible drug-like molecular structures (drug-like chemical space). In this review, we discuss how the former limitation is addressed by modeling virtual screening as a search space problem and how these endeavors utilize machine learning to reduce the number of computational experiments required to identify top candidates. We follow that up by discussing generative methods that attempt to approximate the entire drug-like chemical space, providing a path to explore beyond the known drug-like chemical space. We place special emphasis on generative models that learn the marginal distributions conditioned on specific properties or receptor structures for efficient sampling of molecules. Through this review, we aim to highlight modern machine-learning-based methods that try to efficiently enhance our sampling capability beyond conventional screening methods, which, in turn, would benefit drug design significantly. We therefore also encourage further development of methods that address such important aspects of drug design.
Computational methods and, more recently, modern machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models that predict binding affinities better than the state-of-the-art scoring functions are important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components such as electrostatic, van der Waals, polar and non-polar solvation energy, calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.
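For orientation, the four energy components listed above are the usual terms of an MMPBSA binding free energy estimate. The relation below is the standard textbook decomposition, written in notation chosen here for illustration. Each term is the difference between the complex and the separated protein and ligand, averaged over simulation frames. An entropy term, $-T\Delta S$, is sometimes added, and the abstract does not say whether the dataset includes it.

```latex
\Delta G_{\text{bind}} \approx
    \underbrace{\Delta E_{\text{elec}} + \Delta E_{\text{vdW}}}_{\text{gas-phase molecular mechanics}}
  + \underbrace{\Delta G_{\text{polar}} + \Delta G_{\text{nonpolar}}}_{\text{solvation}}
```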
Spectroscopy is the study of how matter interacts with electromagnetic radiation of specific frequencies, and it has led to several monumental discoveries in science. The spectrum of any particular molecule is highly information-rich, yet the inverse relation from the spectrum to the molecular structure is still an unsolved problem. Nuclear Magnetic Resonance (NMR) spectroscopy is one such critical tool in the toolset scientists use to characterise any chemical sample. In this work, a novel framework is proposed that attempts to solve this inverse problem by navigating the chemical space to find the correct structure that resulted in the target spectrum. The proposed framework uses a combination of online Monte Carlo tree search (MCTS) and a set of offline-trained Graph Convolution Networks to build a molecule iteratively from scratch. Our method is able to predict the correct structure of the molecule ∼80% of the time in its top 3 guesses. We believe that the proposed framework is a significant step in solving the inverse design problem of NMR spectra to molecule.
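To make the tree-search idea concrete, the sketch below shows the selection step of a generic Monte Carlo tree search using the common UCB1 rule. The abstract does not state which selection rule the authors use or how the graph networks enter the search, so this is an assumption. The `Node` class, the action labels, and the reward values are hypothetical. Each child stands for one candidate modification to a partially built molecule, and the rule balances its average reward so far against how rarely it has been tried.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One search-tree node; `label` names the (hypothetical) build action."""
    label: str
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)


def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """UCB1 score: mean reward plus an exploration bonus for rarely tried nodes."""
    if child.visits == 0:
        return math.inf  # always try an unvisited child first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = math.sqrt(2)) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1(ch, parent.visits, c))


# Hypothetical state: a partial molecule with three possible next actions.
root = Node(
    label="partial molecule",
    visits=10,
    children=[
        Node(label="add C", visits=6, total_reward=4.2),
        Node(label="add N", visits=4, total_reward=2.0),
        Node(label="add O"),  # never tried yet
    ],
)

print(select_child(root).label)  # the unvisited action is chosen first
```

In a full search, selection is followed by expansion, evaluation of the new node (here, for example, by a learned model scoring agreement with the target spectrum), and backpropagation of the reward up the tree.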