Late-stage functionalization (LSF) is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, an LSF platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in LSF, the computational model correctly predicted the reactivity of 81% of novel substrates, while reaction yields for diverse reaction conditions were predicted with a mean absolute error margin of 4–5%. The regioselectivity of the major products was accurately captured in up to 90% of the cases studied. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and quantum mechanical information on model performance was quantified. A new, comprehensive simple user-friendly reaction format (SURF) was also introduced, which proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation (HTE) for LSF.
Late-stage functionalization (LSF) is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, an LSF platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in LSF, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4–5%, while the reactivity of novel reactions with known and unknown substrates was classified with balanced accuracies of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured in up to 90% of the cases studied. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and quantum mechanical information on model performance was quantified. A new, comprehensive simple user-friendly reaction format (SURF) was also introduced, which proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation (HTE) for LSF.