2020
DOI: 10.26434/chemrxiv.12609899.v1
Preprint

The Photoswitch Dataset: A Molecular Machine Learning Benchmark for the Advancement of Synthetic Chemistry

Abstract: The space of synthesizable molecules is greater than $10^{60}$, meaning only a vanishingly small fraction of these molecules have ever been realized in the lab. In order to prioritize which regions of this space to explore next, synthetic chemists need access to accurate molecular property predictions. While great advances in molecular machine learning have been made, there is a dearth of benchmarks featuring properties that are useful for the synthetic chemist. Focussing directly on the needs of the s…

Citations: cited by 13 publications (16 citation statements)
References: 78 publications (108 reference statements)
“…This point becomes particularly relevant if one considers that KM-GD models (KRR- In the case of kernel models, specialized software programs have been developed to enable such an efficiency-boosting. 103,104 However, for comparing timings on the ethanol data set, we performed all simulations on CPUs, using the same computer hardware architecture and number of cores to allow for a fair comparison between different models, alleviating possible dependencies of the computational timings concerning specific hardware configuration.…”
Section: Training PES On Energies Only (mentioning)
confidence: 99%
“…The problems of TDDFT have been discussed very recently by Thawani et al. 548 The authors developed a data set for relevant photoswitches, which are useful, e.g., for medical applications or renewable energy technologies. To this aim, photochemical properties of azobenzenes and associated derivatives were manually extracted from experimental papers.…”
Section: Data Sets For Excited States (mentioning)
confidence: 99%
“…Moreover, ML can also learn experimental intermediate properties, such as the absorption wavelength of photoswitch molecules. 215 QM methods could also be used to calculate this property (and ML would be then used as a surrogate model for QM), but these methods can be computationally demanding and can lead to less accurate data than those obtained with experimental measurements. 215 Learning intermediate properties may also provide in-depth theoretical insight into the photophysical processes that take place in the material and that can be experimentally measured.…”
Section: Learning Intermediate Properties (mentioning)
confidence: 99%
“…215 QM methods could also be used to calculate this property (and ML would be then used as a surrogate model for QM), but these methods can be computationally demanding and can lead to less accurate data than those obtained with experimental measurements. 215 Learning intermediate properties may also provide in-depth theoretical insight into the photophysical processes that take place in the material and that can be experimentally measured. Therefore, ML can use experimental data as input and predict important intermediate properties, thus directly linking experiments to the theoretical description.…”
Section: Learning Intermediate Properties (mentioning)
confidence: 99%