2020
DOI: 10.1038/s41597-020-00746-1

QM-symex, update of the QM-sym database with excited state information for 173 kilo molecules

Abstract: In materials science, quantum chemistry databases play an indispensable role in determining the structure and properties of new material molecules and in deep learning for the field. A new quantum chemistry database, QM-sym, was set up in our previous work. QM-sym is an open-access database focusing on transition states, energy, and orbital symmetry. In this work, we put forward QM-symex, with 173 kilo molecules. Each organic molecule in QM-symex combines with the Cnh …

Citations: cited by 24 publications (15 citation statements)
References: 26 publications
“…Recently, another dataset QM-symex with excited-state properties for ten excitations of 173,000 symmetric molecules was reported. 101 As a solution to the limited data availability, ML models are often developed and trained on much more computationally affordable bandgaps or orbital energies from which bandgaps can be calculated. 85,95,98,102–115 Such studies are also facilitated by the availability of many big databases with these properties.…”
Section: Reference Data
Citation type: mentioning
confidence: 99%
“…First, we require large chemical diversity in our training set, and most of the existing molecular excited state databases use TD-DFT. 24,37–40 The largest databases, namely PubChemQC 38 and QM-symex, 41 use B3LYP. Second, B3LYP was used in previous works using linearly calibrated xTB-sTDA 15,16 and is used extensively in machine learning and high-throughput screening studies.…”
Section: A Reference Computational Technique
Citation type: mentioning
confidence: 99%
“…Specifically, the training sets for the ML models considered in this study were derived from the existing PubChemQC (PCQC) 38 and QM-symex 40 databases. For concision, we will use the (functional/basis set) notation to describe the level of theory used in calculations.…”
Section: B Training Dataset
Citation type: mentioning
confidence: 99%
“…224 The QM8 226,227 includes 20 000 synthetically available small molecules, and provides electronic structure values using both TD-DFT and CC2. The QM-symex 228 has 173 000 compounds with excited state energies calculated at the B3LYP/6-31G level, and provides particular emphasis on molecular symmetry. Such databases could be used for the calibration of lower-level methods, e.g.…”
Section: B Excited State Energies
Citation type: mentioning
confidence: 99%