2022
DOI: 10.1021/acs.jctc.1c01264

Inclusion of More Physics Leads to Less Data: Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data

Abstract: Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work, we show that QM descriptors improve ML predictions of dimer interaction energy, both in terms of accuracy and data efficiency, by incorporating electronic information into the descriptor. We present the elec…

Citations: cited by 14 publications (19 citation statements)
References: 89 publications
“…Therefore, this work is another example of how physics-based models with correct asymptotic behaviour can dramatically reduce the amount of data and free parameters required to achieve satisfactory predictive power of a model. 15,32 The Slater valence shells of MBIS partitioning approximately represent the typical decay of the electronic density and thus result in meaningful description of the electrostatic potential, despite its simplicity. The QEq model captures the impact of all the chemical groups, even distant ones, on how the charge is distributed over the molecule.…”
Section: Discussion (mentioning)
confidence: 99%
“…The descriptor used in EDDIE-ML was originally formulated for predicting interaction energy curves for dimer systems. The electron deformation density $\Delta\rho$ for a dimer A···B, defined as $\Delta\rho_{AB} = \rho_{AB} - \rho_A - \rho_B$, was found in conjunction with a GPR model to differentiate accurately between dimer systems separated by increasing intermolecular distance, within 0.3 kcal mol$^{-1}$ for neutral monomers. Briefly, the descriptor formulation involves projection of the electron deformation density $\Delta\rho$ onto atom-centered basis functions,

$$\psi_{nlm}(\vec{r}) = Y_{lm}(\theta, \varphi)\, \zeta_n(r)$$

where $Y_{lm}(\theta, \varphi)$ are spherical harmonics and $\zeta_n(r)$ are radial basis functions with a cutoff radius of $r_0$:

$$\tilde{\zeta}_n(r) = \begin{cases} \dfrac{1}{N}\, r^2 (r_0 - r)^{n+2} & \text{for } r < r_0 \\ 0 & \text{else} \end{cases}$$
…”
Section: Theory and Methods (mentioning)
confidence: 99%
“…GPR is a powerful interpolation method which has been widely used in the chemical machine learning community for predicting properties of materials and molecules. Here, we follow the methodology developed in our earlier work to train a GPR model to predict interaction energies, while introducing a new hybrid kernel function to better resolve between atomic and molecular environments.…”
Section: Theory and Methods (mentioning)
confidence: 99%
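For readers unfamiliar with GPR, the sketch below shows how a posterior-mean prediction is computed. It is a generic textbook illustration under stated assumptions: it uses a plain squared-exponential kernel rather than the hybrid kernel mentioned in the statement (whose form is not given here), and the function names, `length_scale`, and `noise` parameters are all hypothetical.

```python
import numpy as np

def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel between two sets of row-vector descriptors."""
    d2 = np.sum((xa[:, None, :] - xb[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gpr_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-6):
    """GPR posterior mean: k(x*, X) (K + noise * I)^-1 y."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train = k_train + noise * np.eye(len(x_train))
    # Cholesky factorization gives a stable solve of the regularized linear system
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train, length_scale) @ alpha

# Toy usage: a 1D stand-in for descriptor vectors and target energies
x_train = np.linspace(0.0, 3.0, 6).reshape(-1, 1)
y_train = np.sin(x_train[:, 0])
y_pred = gpr_predict(x_train, y_train, x_train, length_scale=0.5)
```

With a small noise term the model interpolates: predictions at the training points reproduce the training targets almost exactly.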
“…A key bottleneck hampering the broader applicability of QM-informed approaches is the presence of unique many-body symmetries necessitated by an explicit treatment on electron-electron interactions. Heuristic schemes have been used to enforce invariance (24, 26, 30–33) at a potential loss of information in their input features or expressivity in their ML models. Two objectives remain elusive for QM-informed ML: 1) incorporate the underlying physical symmetries with maximal data efficiency and model flexibility and 2) accurately infer downstream molecular properties for large chemical spaces at a computational resource requirement on par with existing empirical and Atomistic ML methods.…”
Section: Significance (mentioning)
confidence: 99%