2021
DOI: 10.1021/acs.jcim.1c00699

Kernel Methods for Predicting Yields of Chemical Reactions

Abstract: The use of machine learning methods for the prediction of reaction yield is an emerging area. We demonstrate the applicability of support vector regression (SVR) for predicting reaction yields, using combinatorial data. Molecular descriptors used in regression tasks related to chemical reactivity have often been based on time-consuming, computationally demanding quantum chemical calculations, usually density functional theory. Structure-based descriptors (molecular fingerprints and molecular graphs) are quicke…

Cited by 38 publications (37 citation statements)
References 62 publications
“…The radius determines the number of iterations in the calculation of the identifier of the central atom. With the increase of the radius, the information of the surrounding substructure is increasingly encoded into the identifier [38,39]. Each identifier is updated iteratively to include information on neighbouring atoms (i.e., their identifier and bond order).…”
Section: Molecular Fingerprints
Citation type: mentioning
confidence: 99%
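
The iterative update described in the statement above can be illustrated with a small sketch. This is not the implementation used by the cited work or by any cheminformatics library: the function name `iterative_identifiers`, the helper `_stable_hash`, the choice of initial invariants (element symbol and heavy-atom degree only), and the 1-pentanol example are all assumptions made for illustration. Real circular fingerprints use richer atom invariants and also remove duplicate substructures.

```python
import hashlib


def _stable_hash(data) -> int:
    # Hypothetical helper: a deterministic 32-bit hash of any printable object.
    digest = hashlib.sha1(repr(data).encode("utf-8")).hexdigest()
    return int(digest[:8], 16)


def iterative_identifiers(atoms, bonds, radius):
    """Return the atom identifiers after each iteration 0..radius.

    atoms: list of element symbols, indexed by atom number.
    bonds: list of (i, j, bond_order) tuples.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbours[i].append((order, j))
        neighbours[j].append((order, i))

    # Iteration 0: the identifier depends only on the atom itself
    # (assumed invariants: element symbol and number of neighbours).
    ids = [_stable_hash((sym, len(neighbours[i]))) for i, sym in enumerate(atoms)]
    history = [list(ids)]

    for layer in range(1, radius + 1):
        new_ids = []
        for i in range(len(atoms)):
            # Fold in each neighbour's current identifier and the bond order,
            # sorted so the result does not depend on atom numbering.
            environment = sorted((order, ids[j]) for order, j in neighbours[i])
            new_ids.append(_stable_hash((layer, ids[i], tuple(environment))))
        ids = new_ids
        history.append(list(ids))
    return history


# Heavy atoms of 1-pentanol: C0-C1-C2-C3-C4-O5, all single bonds.
atoms = ["C", "C", "C", "C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]
history = iterative_identifiers(atoms, bonds, radius=2)

for r, layer_ids in enumerate(history):
    same = layer_ids[2] == layer_ids[3]
    print(f"radius {r}: C2 and C3 share an identifier: {same}")
```

In this toy molecule the two inner carbons C2 and C3 have the same element and degree, and their direct neighbours are also indistinguishable, so their identifiers only separate once the radius is large enough for the hydroxyl end of the chain to reach C3's environment. That is the sense in which a larger radius encodes more of the surrounding substructure.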
“…[1-6] Similarly, developments in machine learning (ML) have enabled the distillation of large and complex data sets into predictive models capable of generalizing patterns in the data. [4, 7-13] Despite these advances, efforts to merge HTE with ML remain largely limited to a few reported datasets with limited structural diversity [14-20] and corresponding trained models that do not extrapolate well to substrates beyond the training set.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Modeling the yield of these datasets (4K C-N couplings [15], or 2K Suzuki-Miyaura couplings in flow [14]) produces predictive models with R² or AUROC > 0.9. [11, 15, 27-34] However, models trained on these datasets demonstrate limited ability to extrapolate beyond the molecules in their training sets, in part due to the minimal structural diversity in the dataset.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…[35, 36] This research has been applied to palladium-catalysed reactions, including Suzuki and Buchwald-Hartwig cross-coupling reactions. [37] However, the methodology struggled when applied to patent data which had too much inconsistency for accurate yield prediction. [36] Two issues with chemical reaction data that make predictive tasks challenging are the sparsity of the data within chemical space, and a lack of reported failed experiments.…”
Section: Introduction
Citation type: mentioning
confidence: 99%