De novo exploration and self-guided learning of potential-energy surfaces

Bernstein, Noam; Csányi, Gábor; Deringer, Volker L.

doi:10.1038/s41524-019-0236-6

Cited by 188 publications

(172 citation statements)

References 86 publications

Supporting

Mentioning

163

Contrasting

Unclassified


“…Robotics has spearheaded the efforts to build such data sets through the use of active learning [2]: building data sets by asking ML models to choose what data needs to be added to a training set to perform better next time. Although the concept of active learning originates from robotics, it has recently grown into an extremely important tool for collecting quantum chemistry data sets for use in ML applications [3][4][5][6][7][8][9][10][11].…”

Section: Background and Summary (mentioning)

confidence: 99%

The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules

et al. 2020


Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models. In chemistry, ML has been used to develop models for predicting molecular properties, for example quantum mechanics (QM) calculated potential energy surfaces and atomic charge models. The ANI-1x and ANI-1ccx ML-based general-purpose potentials for organic molecules were developed through active learning, an automated data diversification process. Here, we describe the ANI-1x and ANI-1ccx data sets. To demonstrate data diversity, we visualize it with a dimensionality reduction scheme, and contrast it against existing data sets. The ANI-1x data set contains multiple QM properties from 5 M density functional theory calculations, while the ANI-1ccx data set contains 500 k data points obtained with an accurate CCSD(T)/CBS extrapolation. Approximately 14 million CPU core-hours were expended to generate this data. Multiple QM calculated properties for the chemical elements C, H, N, and O are provided: energies, atomic forces, multipole moments, atomic charges, etc. We provide this data to the community to aid research and development of ML models for chemistry.



“…The full details of the AL scheme are discussed in the "Methods" section. We also refer the reader to the recent success in the applications of AL [24][25][26].…”

Section: Introduction (mentioning)

confidence: 99%

Machine-learned interatomic potentials by active learning: amorphous and liquid hafnium dioxide

et al. 2020

Self Cite


We propose an active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP). Our active learning scheme consists of an unsupervised machine learning (ML) scheme coupled with a Bayesian optimization technique that evaluates the GAP model. We apply this scheme to a hafnium dioxide (HfO2) dataset generated from a "melt-quench" ab initio molecular dynamics (AIMD) protocol. Our results show that the active learning scheme, with no prior knowledge of the dataset, is able to extract a configuration that reaches the required energy fit tolerance. Further, molecular dynamics (MD) simulations performed using this actively learned GAP model on 6144-atom systems in the amorphous and liquid states elucidate the structural properties of HfO2 with near ab initio precision and quench rates (i.e., 1.0 K/ps) not accessible via AIMD. The melt and amorphous X-ray structure factors generated from our simulation are in good agreement with experiment. In addition, the calculated diffusion constants are in good agreement with previous ab initio studies.


“…Structures were visualised using VESTA. [29] distance in any given structure is the same (here, 1.0 Å) [24] an idea that originated in the field of chemical topology. [25] This is a step of key importance, because otherwise the overlap of neighbour densities will be necessarily diminished as soon as there are different A-B distances (Fig.…”

Section: Results (mentioning)

confidence: 99%

Understanding the geometric diversity of inorganic and hybrid frameworks through structural coarse-graining

2020

Self Cite

