2020
DOI: 10.26434/chemrxiv.11811564.v1
Preprint

Data Augmentation and Pretraining for Template-Based Retrosynthetic Prediction in Computer-Aided Synthesis Planning

Abstract: This work presents efforts to augment the performance of data-driven machine learning algorithms for reaction template recommendation used in computer-aided synthesis planning software. Often, machine learning models designed to perform the task of prioritizing reaction templates or molecular transformations are focused on reporting high accuracy metrics for the one-to-one mapping of product molecules in reaction databases to the template extracted from the recorded reaction. The available templates th…

Cited by 3 publications (4 citation statements)
References 11 publications
“…Furthermore, we propose that this methodology can be extended to other specialized domains within synthesis planning tasks where the data may be limited and domain specific knowledge (i.e., a specialist) is required. The methodology could also be extended to combine various data sources to increase domain specific coverage, in addition to data augmentation techniques published at the time of writing this manuscript …”
Section: Discussion (mentioning)
confidence: 99%
“…Atom-to-atom mapping (AAM) [6,7] is a procedure that establishes a correspondence between the atoms of reactants and products. AAM allows to identify a reaction centre (RC) which, in turn, helps to prepare reaction templates used in an automatized forward/retrosynthesis planning, [8][9][10][11][12] as well as to perform reaction classification [13] and reaction searching. [14,15] Several publicly and commercially available AAM tools are currently available.…”
Section: Introduction (mentioning)
confidence: 99%
“…Second, the amount of data is much greater than the domain knowledge of individual researchers. Therefore, with the recent rapid progress of deep learning, the use of machine learning algorithms to learn the latent space of retrosynthetic reaction rules has become a very active research area [1][2][3][4][5][6][7][8][9][10][11]. A rule-based model is a machine learning algorithm that learns the reaction rules that correspond to the input target molecules.…”
Section: Introduction (mentioning)
confidence: 99%
“…The reaction center means changing parts (atom and bonding) before and after the chemical reaction. Hence, the strength of rule-based models [1][2][3][4] is that it is easy to identify the selected reaction rules for the target molecule compared with the molecular transformer based on sequence-to-sequence models and the attention mechanism [5][6][7][8][9].…”
Section: Introduction (mentioning)
confidence: 99%
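
The reaction-centre idea described in the citation statements above (the atoms and bonds that change between reactants and products, located through atom-to-atom mapping) can be sketched in a few lines of Python. This is only a toy illustration, not the method of any of the cited papers: the bond-dictionary representation, the `bond` helper, the `reaction_centre` function, and the ethyl bromide example are all assumptions made here, and changes in charge or hydrogen count are ignored.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical representation: each side of a reaction is a mapping from an
# unordered pair of atom-map numbers to a bond order. Hydrogens are implicit.
Bonds = Dict[FrozenSet[int], float]


def bond(a: int, b: int) -> FrozenSet[int]:
    """Build the unordered key for a bond between mapped atoms a and b."""
    return frozenset((a, b))


def reaction_centre(reactant_bonds: Bonds, product_bonds: Bonds) -> Set[int]:
    """Return the map numbers of atoms whose bonding differs between sides.

    A bond counts as changed if it is formed, broken, or changes order.
    """
    centre: Set[int] = set()
    for pair in reactant_bonds.keys() | product_bonds.keys():
        if reactant_bonds.get(pair, 0.0) != product_bonds.get(pair, 0.0):
            centre.update(pair)
    return centre


# Toy substitution: CH3-CH2-Br + OH(-) -> CH3-CH2-OH + Br(-)
# Assumed atom map: 1 = CH2 carbon, 2 = Br, 3 = O, 4 = CH3 carbon.
reactants: Bonds = {bond(4, 1): 1.0, bond(1, 2): 1.0}
products: Bonds = {bond(4, 1): 1.0, bond(1, 3): 1.0}

print(sorted(reaction_centre(reactants, products)))  # prints [1, 2, 3]
```

The methyl carbon (map number 4) keeps the same bond on both sides, so it falls outside the reaction centre; a template built from this centre would cover only the carbon, bromine, and oxygen atoms.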