2023
DOI: 10.1021/acs.oprd.3c00186

Designing Chemical Reaction Arrays Using Phactor and ChatGPT

Babak Mahjour,
Jillian Hoffstadt,
Tim Cernak

Abstract: High-throughput experimentation is a common practice in the optimization of chemical synthesis. Chemists design reaction arrays to optimize the yield of couplings between building blocks. Popular reactions used in pharmaceutical research include the amide coupling, Suzuki coupling, and Buchwald–Hartwig coupling. We show how the artificial intelligence (AI) language model ChatGPT can automatically formulate reaction arrays for these common reactions based on the literature corpus it was trained on. Critically, …

citations
Cited by 15 publications
(11 citation statements)
references
References 25 publications
“…Data derived from the published literature span a wide range of substrates and reaction types, but each reactant–product combination might be reported only once or twice. In contrast, public datasets from high-throughput experimentation (HTE) exist only for a few reaction types so far (Buchwald–Hartwig amination and Suzuki coupling being the most popular datasets), although more varied datasets, both in terms of reaction types and design workflow, are emerging. Most HTE datasets are generated through parallel plate-based chemistry in 24-, 96-, 384-, or even higher density well formats. In these experimental campaigns, some reaction variables are easy to vary via automated liquid handling capabilities (e.g., the diversity of concentrations and the combinations of additives), while other aspects (e.g., heterogeneous reactants and the diversity of solvents) are harder to vary given the practical challenges of stock solution preparation.…”
Section: Defining the Desired Domain Of Applicability
Type: mentioning
confidence: 99%
“…There is already work on developing automated evolution systems and integrating these into active learning workflows where data generated from automated experiments can train and refine ML models to suggest beneficial variants to explore further. These “design-build-test-learn” cycles would enable continuous optimization of enzymes and other proteins (Figure ), as they can for small molecules. LLMs could power these automated systems, with AI flexibly adapting to perform new types of syntheses and screens with robotic scripts written on the fly. At the same time, multiple desirable properties and activity for multiple reactions could be optimized simultaneously during protein engineering campaigns, powered by generalized ML models that can utilize multimodal representations of proteins. With ever increasing amounts of data on protein structures and sequence-fitness pairs, and new tools to conduct experiments and make ML methods for proteins more accessible to the broader community, the future of ML-assisted protein engineering is bright.…”
Section: Conclusion: Toward General Self-driven Protein Engineering
Type: mentioning
confidence: 99%
“…As a digital guide in this intricate domain, ChatGPT excels in route planning for intricate molecules, proposing viable pathways and intermediates to streamline the synthesis of complex compounds. [20] Furthermore, the model's expertise becomes evident in its ability to propose transformations of groups thereby outlining strategies, for synthesis improvement. The invaluable guidance extends to reagent selection, as ChatGPT skillfully recommends optimal catalysts, reagents, and conditions for specific transformations within synthetic pathways (Figure 3 shows the process by providing response to the prompt given).…”
Section: Retrosynthetic Analysis
Type: mentioning
confidence: 99%
“…The invaluable guidance extends to reagent selection, as ChatGPT skillfully recommends optimal catalysts, reagents, and conditions for specific transformations within synthetic pathways (Figure 3 shows the process by providing response to the prompt given). [20] Beyond its role as a synthesizer's companion, ChatGPT serves as an insightful tutor, unravelling the principles of retrosynthetic analysis by elucidating disconnections, bond formations, and the language of retrosynthetic arrow-pushing. By utilizing its abilities, the model effectively evaluates risks, proactively pinpointing obstacles, along retrosynthetic pathways.…”
Section: Retrosynthetic Analysis
Type: mentioning
confidence: 99%