2023
DOI: 10.1038/s41597-023-02366-x

Atomic structures, conformers and thermodynamic properties of 32k atmospheric molecules

Abstract: Low-volatile organic compounds (LVOCs) drive key atmospheric processes, such as new particle formation (NPF) and growth. Machine learning tools can accelerate studies of these phenomena, but extensive and versatile LVOC datasets relevant for the atmospheric research community are lacking. We present the GeckoQ dataset with atomic structures of 31,637 atmospherically relevant molecules resulting from the oxidation of α-pinene, toluene and decane. For each molecule, we performed comprehensive conformer sampling …

citations
Cited by 7 publications
(5 citation statements)
references
References 54 publications
“…NNs have been used to model large sulfuric acid–dimethylamine clusters and the NN potential ANI-2x has been benchmarked for small dimer clusters. KRR/GPR has been used to predict cluster binding energies, saturation vapor pressures of organic molecules, and chemical potentials of organic molecules in atmospherically relevant solutions …”
Section: Methods (mentioning)
confidence: 99%
“…[48] NNs have been used to model large sulfuric acid−dimethylamine clusters [47] and the NN potential ANI-2x [91] has been benchmarked for small dimer clusters. [92] KRR/GPR has been used to predict cluster binding energies, [44–46, 61] saturation vapor pressures of organic molecules, [41, 42] and chemical potentials of organic molecules in atmospherically relevant solutions. [43] Our ML-oriented subpackage, JKML, offers an interface between the JKQC-constructed database files (e.g., those stored in ACDB 2.0) and two ML programs, quantum machine learning (QML [93]) and SchNetPack.…”
Section: Used (mentioning)
confidence: 99%
“…Figure shows a first visualization of this overlap. The figure presents a t‐stochastic neighborhood embedding (t‐SNE) analysis for three atmospheric molecular datasets (here referred to as Gecko [165, 166], Wang [167], and Quinones [168, 169]) and two datasets of drug and metabolite compounds, representative of those in mass spectral databases (nablaDFT [170, 171] and Massbank of North America [103]). t‐SNE clustered the compounds according to the similarity of their (molecular) topological fingerprints.…”
Section: Toward Data‐driven Compound Identification In Atmospheric Ma... (mentioning)
confidence: 99%
“…In addition, community datasets, such as refs. [68, 166, 187, 188], could complement data infrastructures. They offer distinct advantages such as having been purposefully curated with design criteria like similarity and balance in mind.…”
Section: Toward Data‐driven Compound Identification In Atmospheric Ma... (mentioning)
confidence: 99%