Late-stage functionalization (LSF) is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, an LSF platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in LSF, the computational model correctly predicted the reactivity of 81% of novel substrates, while reaction yields for diverse reaction conditions were predicted with a mean absolute error margin of 4–5%. The regioselectivity of the major products was accurately captured in up to 90% of the cases studied. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and quantum mechanical information on model performance was quantified, and a new, comprehensive simple user-friendly reaction format (SURF) was introduced, which proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation (HTE) for LSF.
Late-stage functionalization (LSF) is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, an LSF platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in LSF, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4–5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured in up to 90% of the cases studied. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and quantum mechanical information on model performance was quantified, and a new, comprehensive simple user-friendly reaction format (SURF) was introduced, which proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation (HTE) for LSF.
Optimizing the properties of advanced drug candidates can be facilitated by directly introducing certain chemical groups without having to synthesize the molecules from scratch. However, the chemical complexity of these molecules often renders reactivity predictions and synthesis planning challenging. Herein, we introduce a graph transformer neural network (GTNN) approach for computational reaction screening and identification of substrates suitable for late-stage functionalization, taking compound alkylation via Minisci-type chemistry as an example. GTNNs were trained on experimentally generated reactions obtained from miniaturized high-throughput experimentation and on literature data. Trained models were prospectively applied to predicting the reactivity of 3180 advanced heterocyclic molecules, identifying potential substrates for Minisci-type alkylation. All predicted substrates were experimentally confirmed. Multiple chemical transformations were identified for each of these compounds. Selected hits were scaled up, isolated, and characterized, delivering 30 novel, suitably functionalized molecules for medicinal chemistry. These results support the use of GTNN models for reactivity prediction in drug discovery.