Quantum Monte Carlo (QMC) is among the most accurate methods for solving the time-independent Schrödinger equation. Unfortunately, the method is very expensive and requires a vast array of computing resources to obtain results at a reasonable level of convergence. On the other hand, the method is not only easily parallelizable across CPU clusters but, as we report here, also has a high degree of data parallelism. This facilitates the use of recent technological advances in Graphical Processing Units (GPUs), a powerful type of processor well known to computer gamers. In this paper we report on an end-to-end QMC application with core elements of the algorithm running on a GPU. With individual kernels achieving as much as a 30x speedup, the overall application runs up to 6x faster than an optimized CPU implementation, yet requires only a modest increase in hardware cost. This demonstrates the speedups possible for QMC on advanced hardware, and explores a path toward providing QMC-level accuracy as a more standard tool. The major current challenge in running codes of this type on the GPU arises from the lack of fully compliant IEEE floating point implementations. To achieve better accuracy we propose the use of the Kahan summation formula in matrix multiplications. While this reduces overall performance, we demonstrate that the proposed new algorithm can match CPU single precision.
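The idea of using Kahan (compensated) summation inside a matrix multiplication can be sketched on the CPU as follows. This is a minimal illustration in Python with NumPy float32 scalars, not the GPU kernel from the paper; the function names `naive_dot`, `kahan_dot`, and `kahan_matmul` are hypothetical and chosen only for this example.

```python
import numpy as np

def naive_dot(a, b):
    """Plain single-precision dot product, for comparison."""
    total = np.float32(0.0)
    for x, y in zip(a, b):
        total = total + np.float32(x) * np.float32(y)
    return total

def kahan_dot(a, b):
    """Single-precision dot product with Kahan compensated summation."""
    total = np.float32(0.0)
    comp = np.float32(0.0)  # running estimate of the low-order bits lost so far
    for x, y in zip(a, b):
        term = np.float32(x) * np.float32(y) - comp  # re-inject the lost bits
        new_total = total + term                     # low-order bits of term may be lost here
        comp = (new_total - total) - term            # recover what was just lost
        total = new_total
    return total

def kahan_matmul(A, B):
    """Matrix product C = A @ B where each entry is accumulated with kahan_dot."""
    A = np.asarray(A, dtype=np.float32)
    B = np.asarray(B, dtype=np.float32)
    if A.shape[1] != B.shape[0]:
        raise ValueError("inner dimensions do not match")
    C = np.empty((A.shape[0], B.shape[1]), dtype=np.float32)
    for i in range(A.shape[0]):
        for j in range(B.shape[1]):
            C[i, j] = kahan_dot(A[i, :], B[:, j])
    return C

# One large term followed by many terms too small to register individually
# in single precision: the naive sum drops them, the compensated sum does not.
row = np.array([[1.0] + [1e-8] * 10000], dtype=np.float32)
col = np.ones((10001, 1), dtype=np.float32)
naive_result = naive_dot(row[0], col[:, 0])
kahan_result = kahan_matmul(row, col)[0, 0]
```

The compensated loop performs several extra floating point operations per term, which is consistent with the performance cost the abstract mentions.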
We present a technique for using quantum Monte Carlo (QMC) to obtain high quality energy differences. We use generalized valence bond (GVB) wave functions, for an intuitive approach to capturing the important sources of static correlation, without needing to optimize the orbitals with QMC. Using our modifications to walker branching and Jastrows, we can then reliably use diffusion quantum Monte Carlo to add in all the dynamic correlation. This simple approach is easily accurate to within a few tenths of a kcal/mol for a variety of problems, which we demonstrate for the adiabatic singlet-triplet splitting in methylene, the vertical and adiabatic singlet-triplet splitting in ethylene, 2+2 cycloaddition, and Be2 bond breaking.
We describe the Dynamic Distributable Decorrelation Algorithm (DDDA), which efficiently calculates the true statistical error of an expectation value obtained from serially correlated data "on-the-fly," as the calculation progresses. DDDA is an improvement on the Flyvbjerg-Petersen renormalization group blocking method (Flyvbjerg and Petersen, J Chem Phys 1989, 91, 461). This "on-the-fly" determination of statistical quantities allows dynamic termination of Monte Carlo calculations once a specified level of convergence is attained. This is highly desirable when the required precision might take days or months to compute, but cannot be accurately estimated prior to the calculation. Furthermore, DDDA allows for a parallel implementation which requires very low communication, O(log_2 N), and can also evaluate the variance of a calculation efficiently "on-the-fly." Quantum Monte Carlo calculations are presented to illustrate "on-the-fly" variance calculations for serial and massively parallel Monte Carlo calculations.
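An on-the-fly version of pairwise blocking can be sketched as below. This is a minimal serial illustration written from the general idea of the blocking method (average adjacent pairs of samples repeatedly and track the error estimate at each block size); it is not the authors' DDDA implementation and omits the parallel merge step. The class name `OnTheFlyBlocker` and its methods are hypothetical.

```python
import math

class OnTheFlyBlocker:
    """Accumulates blocking statistics one sample at a time.

    Level k holds statistics of averages over blocks of 2**k samples,
    so storage grows as O(log2 N) for N samples.
    """

    def __init__(self):
        self.sums = []      # per-level sum of block averages
        self.sq_sums = []   # per-level sum of squared block averages
        self.counts = []    # per-level number of block averages seen
        self.waiting = []   # per-level unpaired block average, or None

    def add(self, x):
        value = float(x)
        level = 0
        while True:
            if level == len(self.sums):
                # First block average ever to reach this level.
                self.sums.append(0.0)
                self.sq_sums.append(0.0)
                self.counts.append(0)
                self.waiting.append(None)
            self.sums[level] += value
            self.sq_sums[level] += value * value
            self.counts[level] += 1
            if self.waiting[level] is None:
                # No partner yet: hold this value until the next one arrives.
                self.waiting[level] = value
                return
            # Pair with the waiting value and pass the average up one level.
            value = 0.5 * (self.waiting[level] + value)
            self.waiting[level] = None
            level += 1

    def mean(self):
        return self.sums[0] / self.counts[0]

    def error_estimates(self):
        """Return (block_size, standard_error) for each level with >= 2 blocks."""
        estimates = []
        for level, n in enumerate(self.counts):
            if n < 2:
                continue
            mean = self.sums[level] / n
            # Unbiased sample variance of the block averages at this level.
            variance = (self.sq_sums[level] / n - mean * mean) * n / (n - 1)
            estimates.append((2 ** level, math.sqrt(max(variance, 0.0) / n)))
        return estimates

blocker = OnTheFlyBlocker()
for sample in [1, 2, 3, 4, 5, 6, 7, 8]:
    blocker.add(sample)
estimates = blocker.error_estimates()
```

For serially correlated data the standard error typically rises with block size and then levels off; the plateau value is the usual choice for the error estimate, and a run could be stopped once that value falls below a target.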