The molecular dipole moment (µ) is a central quantity in chemistry. It is essential in predicting infrared and sum-frequency generation spectra, as well as induction and long-range electrostatic interactions. Furthermore, it can be extracted directly-via the ground state electron density-from high-level quantum mechanical calculations, making it an ideal target for machine learning (ML). In this work, we choose to represent this quantity with a physically inspired ML model that captures two distinct physical effects: local atomic polarization is captured within the symmetry-adapted Gaussian process regression (SA-GPR) framework, which assigns a (vector) dipole moment to each atom, while movement of charge across the entire molecule is captured by assigning a partial (scalar) charge to each atom. The resulting "MuML" models are fitted together to reproduce molecular µ computed using high-level coupled-cluster theory (CCSD) and density functional theory (DFT) on the QM7b dataset, achieving more accurate results due to the physics-based combination of these complementary terms. The combined model shows excellent transferability when applied to a showcase dataset of larger and more complex molecules, approaching the accuracy of DFT at a small fraction of the computational cost. We also demonstrate that the uncertainty in the predictions can be estimated reliably using a calibrated committee model. The ultimate performance of the models-and the optimal weighting of their combination-depend, however, on the details of the system at hand, with the scalar model being clearly superior when describing large molecules whose dipole is almost entirely generated by charge separation. These observations point to the importance of simultaneously accounting for the local and non-local effects that contribute to µ; further, they define a challenging task to benchmark future models, particularly those aimed at the description of condensed phases.
The predictive simulation of molecular liquids requires models that are not only accurate, but computationally efficient enough to handle the large systems and long time scales required for reliable prediction of macroscopic properties. We present a new approach to the systematic approximation of the first-principles potential energy surface (PES) of molecular liquids using the GAP (Gaussian Approximation Potential) framework. The approach allows us to create potentials at several different levels of accuracy in reproducing the true PES, which allows us to test the level of quantum chemistry that is necessary to accurately predict its macroscopic properties. We test the approach by building potentials for liquid methane (CH 4 ), which is difficult to model from first principles because its behavior is dominated by weak dispersion interactions with a significant many-body component. We find that an accurate, consistent prediction of its bulk density across a wide range of temperature and pressure requires not only many-body dispersion, but also quantum nuclear effects to be modeled accurately. * max.veit@epfl.ch; Current address:
In recent years, we have been witnessing a paradigm shift in computational materials science. In fact, traditional methods, mostly developed in the second half of the XXth century, are being complemented, extended, and sometimes even completely replaced by faster, simpler, and often more accurate approaches. The new approaches, that we collectively label by machine learning, have their origins in the fields of informatics and artificial intelligence, but are making rapid inroads in all other branches of science. With this in mind, this Roadmap article, consisting of multiple contributions from experts across the field, discusses the use of machine learning in materials science, and share perspectives on current and future challenges in problems as diverse as the prediction of materials properties, the construction of force-fields, the development of exchange correlation functionals for density-functional theory, the solution of the many-body problem, and more. In spite of the already numerous and exciting success stories, we are just at the beginning of a long path that will reshape materials science for the many challenges of the XXIth century.
Physically motivated and mathematically robust atom-centered representations of molecular structures are key to the success of modern atomistic machine learning. They lie at the foundation of a wide range of methods to predict the properties of both materials and molecules and to explore and visualize their chemical structures and compositions. Recently, it has become clear that many of the most effective representations share a fundamental formal connection. They can all be expressed as a discretization of n-body correlation functions of the local atom density, suggesting the opportunity of standardizing and, more importantly, optimizing their evaluation. We present an implementation, named librascal, whose modular design lends itself both to developing refinements to the density-based formalism and to rapid prototyping for new developments of rotationally equivariant atomistic representations. As an example, we discuss smooth overlap of atomic position (SOAP) features, perhaps the most widely used member of this family of representations, to show how the expansion of the local density can be optimized for any choice of radial basis sets. We discuss the representation in the context of a kernel ridge regression model, commonly used with SOAP features, and analyze how the computational effort scales for each of the individual steps of the calculation. By applying data reduction techniques in feature space, we show how to reduce the total computational cost by a factor of up to 4 without affecting the model's symmetry properties and without significantly impacting its accuracy.
Modeling ferroelectric materials from first principles is one of the successes of density-functional theory and the driver of much development effort, requiring an accurate description of the electronic processes and the thermodynamic equilibrium that drive the spontaneous symmetry breaking and the emergence of macroscopic polarization. We demonstrate the development and application of an integrated machine learning model that describes on the same footing structural, energetic, and functional properties of barium titanate (BaTiO3), a prototypical ferroelectric. The model uses ab initio calculations as a reference and achieves accurate yet inexpensive predictions of energy and polarization on time and length scales that are not accessible to direct ab initio modeling. These predictions allow us to assess the microscopic mechanism of the ferroelectric transition. The presence of an order-disorder transition for the Ti off-centered states is the main driver of the ferroelectric transition, even though the coupling between symmetry breaking and cell distortions determines the presence of intermediate, partly-ordered phases. Moreover, we thoroughly probe the static and dynamical behavior of BaTiO3 across its phase diagram without the need to introduce a coarse-grained description of the ferroelectric transition. Finally, we apply the polarization model to calculate the dielectric response properties of the material in a full ab initio manner, again reproducing the correct qualitative experimental behavior.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.