We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein−ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.
The restrained electrostatic potential (RESP) approach is a highly regarded and widely used method of assigning partial charges to molecules for simulations. RESP uses a quantum-mechanical method that yields fortuitous overpolarization and thereby accounts only approximately for self-polarization of molecules in the condensed phase. Here we present RESP2, a next generation of this approach, where the polarity of the charges is tuned by a parameter, δ, which scales the contributions from gas- and aqueous-phase calculations. When the complete non-bonded force field model, including Lennard-Jones parameters, is optimized to liquid properties, improved accuracy is achieved, even with this reduced set of five Lennard-Jones types. We argue that RESP2 with δ ≈ 0.6 (60% aqueous, 40% gas-phase charges) is an accurate and robust method of generating partial charges, and that a small set of Lennard-Jones types is a good starting point for a systematic re-optimization of this important non-bonded term.
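The δ mixing described in this abstract can be illustrated with a minimal sketch, assuming δ acts as a simple per-atom linear weight between the two charge sets (consistent with "60% aqueous, 40% gas-phase", but not confirmed here as the exact published procedure). The function name `resp2_charges` and the example charge values are hypothetical and are not data from the paper.

```python
def resp2_charges(gas_charges, aqueous_charges, delta=0.6):
    """Linearly mix per-atom charges: delta * aqueous + (1 - delta) * gas.

    Assumption: delta is a plain linear weight applied atom by atom.
    """
    if len(gas_charges) != len(aqueous_charges):
        raise ValueError("charge lists must have the same length")
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return [
        delta * q_aq + (1.0 - delta) * q_gas
        for q_gas, q_aq in zip(gas_charges, aqueous_charges)
    ]


# Hypothetical charges for a three-atom, water-like molecule (illustration only).
gas = [-0.70, 0.35, 0.35]
aqueous = [-0.80, 0.40, 0.40]
mixed = resp2_charges(gas, aqueous, delta=0.6)
print(mixed)
```

Because both input charge sets sum to the same net charge, a linear mix preserves that net charge for any δ, which is one reason a linear weight is a convenient choice for a sketch like this.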
We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, codenamed Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein−ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔH_mix, ρ(x), ΔG_solv, and ΔG_trans. Additionally, we benchmarked against protein−ligand binding free energies (ΔG_bind), where Sage yields results statistically similar to previous force fields.
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.
The parameterization of torsional/dihedral angle potential energy terms is a crucial part of developing molecular mechanics force fields. Quantum mechanical (QM) methods are often used to provide samples of the potential energy surface (PES) for fitting the empirical parameters in these force field terms. To ensure that the sampled molecular configurations are thermodynamically feasible, constrained QM geometry optimizations are typically carried out, which relax the orthogonal degrees of freedom while fixing the target torsion angle(s) on a grid of values. However, the quality of results and computational cost are affected by various factors on a non-trivial PES, such as dependence on the chosen scan direction and the lack of efficient approaches to integrate results started from multiple initial guesses. In this paper we propose a systematic and versatile workflow called TorsionDrive to generate energy-minimized structures on a grid of torsion constraints by means of a recursive wavefront propagation algorithm, which resolves the deficiencies of conventional scanning approaches and generates higher quality QM data for force field development. The capabilities of our method are presented for multi-dimensional scans and multiple initial guess structures, and an integration with the MolSSI QCArchive distributed computing ecosystem is described. The method is implemented in an open-source software package that is compatible with many QM software packages and energy minimization codes.
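The wavefront idea in this abstract can be sketched on a toy model. The code below is a simplified illustration and not the TorsionDrive implementation: the two-variable energy function, the gradient-descent "constrained optimizer", and all names (`energy`, `constrained_minimize`, `serial_scan`, `wavefront_scan`, `TILT`, `STEP`) are hypothetical. A hidden coordinate `y` stands in for the orthogonal degrees of freedom and has two wells, so a one-directional scan can stay trapped in the higher well at some grid points. In the wavefront version, every grid point that reaches a new lower energy passes its relaxed structure to both neighbors as a fresh starting guess.

```python
import math
from collections import deque

TILT = 2.0   # toy coupling between torsion angle and hidden coordinate (hypothetical)
STEP = 30    # torsion grid spacing in degrees


def energy(phi_deg, y):
    """Toy PES: double well in y, tilted by the torsion angle phi."""
    return (y * y - 1.0) ** 2 + TILT * math.cos(math.radians(phi_deg)) * y


def constrained_minimize(phi_deg, y_start, lr=0.02, n_iter=5000):
    """Relax y by plain gradient descent while phi stays fixed."""
    tilt = TILT * math.cos(math.radians(phi_deg))
    y = y_start
    for _ in range(n_iter):
        y -= lr * (4.0 * y ** 3 - 4.0 * y + tilt)
    return y


def serial_scan(phi_start=0, y_start=-1.0):
    """Conventional scan: step in one direction, reusing the previous structure."""
    energies, y = {}, y_start
    for raw_phi in range(phi_start, phi_start + 360, STEP):
        phi = raw_phi % 360
        y = constrained_minimize(phi, y)
        energies[phi] = energy(phi, y)
    return energies


def wavefront_scan(phi_start=0, y_start=-1.0, tol=1e-6):
    """Wavefront propagation: a grid point that improves seeds both neighbors."""
    best = {}  # grid angle -> (lowest energy found, relaxed y)
    queue = deque([(phi_start, y_start)])
    while queue:
        phi, y_guess = queue.popleft()
        y = constrained_minimize(phi, y_guess)
        e = energy(phi, y)
        if phi not in best or e < best[phi][0] - tol:
            best[phi] = (e, y)
            for neighbor in ((phi - STEP) % 360, (phi + STEP) % 360):
                queue.append((neighbor, y))
    return {phi: e for phi, (e, _) in best.items()}


serial = serial_scan()
wave = wavefront_scan()
for phi in sorted(wave):
    print(f"{phi:3d}  serial={serial[phi]: .3f}  wavefront={wave[phi]: .3f}")
```

In this toy model the serial scan reports a higher energy than the wavefront scan at 120° and 300°, because it carries the structure from the wrong well across the grid. The loop terminates because a grid point only re-seeds its neighbors when its energy drops by more than `tol`, and the toy surface has a finite number of local minima.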