Several pool-based active learning algorithms have been employed to model potential energy surfaces (PESs) with a minimum number of electronic structure calculations. Among these, the class of uncertainty-based algorithms is popular. Their key principle is to query the molecular structures whose predictions carry high uncertainty. We show empirically that this strategy is not optimal for nonuniform data distributions, because it collects many structures from sparsely sampled regions, which are less important to applications of the PES. We exploit a simple stochastic algorithm to correct for this behavior and implement it using regression trees, which have relatively small computational costs. We show that this algorithm requires around half the data to converge to the same accuracy as the uncertainty-based algorithm query-by-committee. Simulations on a 6D PES of pyrrole(H₂O) show that fewer than 15 000 configurations are enough to build a PES with a generalization error of 16 cm⁻¹, whereas the final model with around 50 000 configurations has a generalization error of 11 cm⁻¹.
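The contrast between the two query strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the stochastic rule, so sampling with probability proportional to the committee variance is an assumption made here, and the function names `committee_uncertainty`, `query_max_uncertainty`, and `query_stochastic` are hypothetical. The committee predictions are taken as given rather than produced by actual regression trees.

```python
import random
from statistics import pvariance


def committee_uncertainty(predictions):
    """Variance of the committee predictions for each pool point.

    `predictions` holds one list per committee member, each with one
    predicted energy per pool point.
    """
    return [pvariance(column) for column in zip(*predictions)]


def query_max_uncertainty(uncertainty, batch_size):
    """Query-by-committee style rule: take the most uncertain points."""
    order = sorted(range(len(uncertainty)),
                   key=lambda i: uncertainty[i], reverse=True)
    return order[:batch_size]


def query_stochastic(uncertainty, batch_size, rng):
    """Assumed stochastic rule: draw points without replacement with
    probability proportional to their uncertainty, so that densely
    sampled regions of the pool are still visited often."""
    candidates = list(range(len(uncertainty)))
    weights = list(uncertainty)
    chosen = []
    for _ in range(min(batch_size, len(candidates))):
        if sum(weights) > 0:
            pos = rng.choices(range(len(candidates)), weights=weights)[0]
        else:
            # All remaining uncertainties are zero: fall back to uniform.
            pos = rng.randrange(len(candidates))
        chosen.append(candidates.pop(pos))
        weights.pop(pos)
    return chosen


# Three committee members, three pool points (made-up numbers).
predictions = [
    [1.0, 2.0, 3.0],
    [1.0, 2.5, 5.0],
    [1.0, 1.5, 4.0],
]
uncertainty = committee_uncertainty(predictions)
greedy = query_max_uncertainty(uncertainty, batch_size=2)
sampled = query_stochastic(uncertainty, batch_size=2, rng=random.Random(0))
```

In a pool where most structures lie in a well-sampled region, the greedy rule repeatedly picks the outliers with the largest variance, while the stochastic rule spreads its queries according to both uncertainty and how many pool points share it.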
Approximating functions by a linear span of truncated basis sets is a standard procedure for the numerical solution of differential and integral equations. Commonly used approximation methods are well-posed and convergent, with provable approximation orders. On the downside, however, these methods often suffer from the curse of dimensionality, which limits their approximation behavior, especially for highly oscillatory target functions. Nonlinear approximation methods, such as neural networks, were shown to be very efficient in approximating high-dimensional functions. We investigate nonlinear approximation methods that are constructed by composing standard basis sets with normalizing flows. Such models yield richer approximation spaces while maintaining the density properties of the initial basis set, as we show. Simulations to approximate eigenfunctions of a perturbed quantum harmonic oscillator indicate convergence with respect to the size of the basis set.
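The idea of composing a basis set with an invertible map can be sketched in one dimension. The Python example below is an illustration under stated assumptions, not the construction from the paper: the fixed map `flow(x) = x + a*tanh(x)` stands in for a trained normalizing flow, and the square-root Jacobian factor in `flow_basis` is an assumption introduced here so that orthonormality carries over by a change of variables. All function names are hypothetical.

```python
import math


def hermite_functions(x, n_max):
    """Orthonormal Hermite functions psi_0..psi_n_max at x (stable recurrence)."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * x * psi[0])
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * x * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi


def flow(x, a=0.5):
    """Toy invertible map (strictly increasing for a > -1)."""
    return x + a * math.tanh(x)


def flow_derivative(x, a=0.5):
    """Derivative of the toy map; positive everywhere for a > -1."""
    return 1.0 + a * (1.0 - math.tanh(x) ** 2)


def flow_basis(x, n_max, a=0.5):
    """Composed basis psi_n(flow(x)) * sqrt(flow'(x)) (assumed form)."""
    scale = math.sqrt(flow_derivative(x, a))
    return [scale * p for p in hermite_functions(flow(x, a), n_max)]


def overlap_matrix(n_max, a=0.5, x_min=-12.0, x_max=12.0, n_points=6001):
    """Overlap integrals of the composed basis via the trapezoidal rule."""
    h = (x_max - x_min) / (n_points - 1)
    size = n_max + 1
    s = [[0.0] * size for _ in range(size)]
    for k in range(n_points):
        x = x_min + k * h
        w = h if 0 < k < n_points - 1 else 0.5 * h
        b = flow_basis(x, n_max, a)
        for i in range(size):
            for j in range(size):
                s[i][j] += w * b[i] * b[j]
    return s


overlaps = overlap_matrix(n_max=3)
```

Under these assumptions the overlap matrix stays numerically close to the identity, which is a small consistency check that the composed functions remain orthonormal while their shape is deformed by the map.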