2021
DOI: 10.1021/acs.accounts.0c00745

Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties

Abstract (Conspectus): Machine-readable chemical structure representations are foundational in all attempts to harness machine learning for the prediction of reactivities, selectivities, and chemical properties directly from molecular structure. The featurization of discrete chemical structures into a continuous vector space is a critical phase undertaken before model selection, and the development of new ways to quantitatively encode molecules is an active area of research. In this Account, we highlight the application a…

citations
Cited by 92 publications
(81 citation statements)
references
References 55 publications
“…[188,189] Others are harnessing the power of machine learning methods for accelerated reaction discovery and chemical space exploration. [190][191][192] Nonetheless, we hope that we have given readers a snapshot of the utility of computational approaches through tales of transition-metal-catalyzed sigmatropic rearrangements. Timely publications describing experimental and computational aspects related to this topic have come out during the preparation of this review and we hope that interested readers will check them out for further reading.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…A number of very recent reviews prove the fact that they are called to change the way in which catalysts are discovered. [51][52][53][54][55] The starting point of data-driven tools is establishing a relationship between a quantitative description of reactants, catalysts and reaction conditions with a property (for instance, activity: Quantitative Structure-Activity relationship, QSAR). The key ingredients of these models are the quantities (descriptors) that are correlated with the properties of interest.…”
Section: Designing Catalysts and Discovering New Reactions
Citation type: mentioning
confidence: 99%
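The descriptor-to-property relationship described in the statement above can be illustrated with a minimal sketch. It assumes a single numeric descriptor and an ordinary least-squares line, which is only one of many possible model choices; the function name `fit_line` and all data values are hypothetical and are not taken from the cited work.

```python
def fit_line(descriptor, activity):
    """Ordinary least-squares fit of activity = slope * descriptor + intercept."""
    if len(descriptor) != len(activity) or len(descriptor) < 2:
        raise ValueError("need two or more paired observations")
    n = len(descriptor)
    mean_x = sum(descriptor) / n
    mean_y = sum(activity) / n
    # Sum of squares of the descriptor and its cross-product with activity
    sxx = sum((x - mean_x) ** 2 for x in descriptor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(descriptor, activity))
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one descriptor value and one measured activity per catalyst
descriptor_values = [1.0, 2.0, 3.0, 4.0]
measured_activity = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(descriptor_values, measured_activity)

# Predict the activity of an untested catalyst from its descriptor value
predicted_activity = slope * 5.0 + intercept
```

Real QSAR models typically combine many descriptors and use regularized or nonlinear learners; the single-descriptor line only shows the shape of the workflow: compute descriptors, fit, then predict.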
“…Although these techniques are still at an early stage in this field, they have already achieved impressive successes, particularly in the area of enantioselective catalysis, [51,52] proving its potential. A number of very recent reviews prove the fact that they are called to change the way in which catalysts are discovered [51–55] …”
Section: Designing Catalysts and Discovering New Reactions
Citation type: mentioning
confidence: 99%
“…[23][24][25][26][27][28] When trained against a large number of experimentally measured chemical shifts, these methods have achieved predictive accuracies of 1.7 ppm for 13C chemical shifts and 0.2 ppm for 1H shifts (expressed as mean absolute error, MAE). [23] These earlier ML approaches tend to rely upon feature engineering [29]: expert-crafted rules are required to encode atomic environment, which can suffer from human bias and incompleteness, and which are often trained separately for different atom types (e.g., different models are developed for tetrahedral and trigonal carbon atoms). In particular, the rise of feature learning, as embodied by graph neural networks (GNNs), [30] has enabled 'end-to-end' learning from molecular structures and avoids rule-based encoding.…”
Citation type: mentioning
confidence: 99%
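The mean absolute error (MAE) quoted in the statement above is the average of the absolute differences between predicted and measured values. A minimal sketch follows; the function name and the chemical-shift values are hypothetical illustrations, not data from the cited work.

```python
def mean_absolute_error(predicted, measured):
    """Average absolute difference between paired predictions and measurements."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not predicted:
        raise ValueError("need at least one pair of values")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical 13C chemical shifts in ppm (illustrative values only)
measured_ppm = [128.5, 77.2, 21.0, 170.1]
predicted_ppm = [130.0, 76.0, 22.5, 168.3]

mae_ppm = mean_absolute_error(predicted_ppm, measured_ppm)
print(f"MAE = {mae_ppm:.2f} ppm")
```

Because MAE is in the same units as the quantity being predicted, the 13C and 1H figures in the quoted statement are not directly comparable with each other without accounting for the very different shift ranges of the two nuclei.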