We propose a novel approach to deriving partial atomic charges from population analysis. The new model, called Charge Model 5 (CM5), yields class IV partial atomic charges by mapping from those obtained by Hirshfeld population analysis of density functional electronic charge distributions. The CM5 model utilizes a single set of parameters derived by fitting to reference values of the gas-phase dipole moments of 614 molecular structures. An additional test set (not included in the CM5 parametrization) contained 107 singly charged ions with nonzero dipole moments, calculated from the accurate electronic charge density, with respect to the center of nuclear charges. The CM5 model is applicable to any charged or uncharged molecule composed of any element of the periodic table in the gas phase or in solution. The CM5 model predicts dipole moments for the tested molecules that are more accurate on average than those from the original Hirshfeld method or from many other popular schemes including atomic polar tensor and Löwdin, Mulliken, and natural population analyses. In addition, the CM5 charge model is essentially independent of a basis set. It can be used with larger basis sets, and thereby this model significantly improves on our previous charge models CMx (x = 1-4 or 4M) and other methods that are prone to basis set sensitivity. CM5 partial atomic charges are less conformationally dependent than those derived from electrostatic potentials. The CM5 model does not suffer from ill conditioning for buried atoms in larger molecules, as electrostatic fitting schemes sometimes do. The CM5 model can be used with any level of electronic structure theory (Hartree-Fock, post-Hartree-Fock, and other wave function correlated methods or density functional theory) as long as an accurate electronic charge distribution and a Hirshfeld analysis can be computed for that level of theory.
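The dipole-moment comparison described above can be illustrated with a minimal sketch that computes a point-charge dipole from a set of partial atomic charges, taking the center of nuclear charge as the origin (as the abstract specifies for ions). This is not the CM5 mapping itself: the CM5 parameters are not given here, and the function names and the two-atom example are hypothetical.

```python
import numpy as np

# 1 e*Angstrom expressed in debye
E_ANGSTROM_TO_DEBYE = 4.80320


def center_of_nuclear_charge(atomic_numbers, coords):
    """Return sum(Z_i * r_i) / sum(Z_i) for coordinates in Angstrom."""
    z = np.asarray(atomic_numbers, dtype=float)
    r = np.asarray(coords, dtype=float)
    return (z[:, None] * r).sum(axis=0) / z.sum()


def point_charge_dipole(charges, atomic_numbers, coords):
    """Dipole vector and magnitude (debye) from partial atomic charges.

    charges are in units of e, coords in Angstrom. The origin is the
    center of nuclear charge; for a neutral molecule the result does
    not depend on the origin, for an ion it does.
    """
    q = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    origin = center_of_nuclear_charge(atomic_numbers, r)
    mu = (q[:, None] * (r - origin)).sum(axis=0) * E_ANGSTROM_TO_DEBYE
    return mu, float(np.linalg.norm(mu))


# Hypothetical two-atom example: charges of +0.5 e and -0.5 e, 1 Angstrom apart
mu_vec, mu_mag = point_charge_dipole(
    charges=[0.5, -0.5],
    atomic_numbers=[1, 1],
    coords=[[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
)
```

Any set of partial charges (Hirshfeld, CM5, Mulliken, and so on) could be passed to the same function and the resulting magnitude compared against a reference dipole moment.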
With the advent of make-on-demand commercial libraries, the number of purchasable compounds available for virtual screening and assay has grown explosively in recent years, with several libraries eclipsing one billion compounds. Today’s screening libraries are larger and more diverse, enabling the discovery of more-potent hit compounds and unlocking new areas of chemical space, represented by new core scaffolds. Applying physics-based in silico screening methods in an exhaustive manner, where every molecule in the library must be enumerated and evaluated independently, is increasingly cost-prohibitive. Here, we introduce a protocol for machine learning-enhanced molecular docking based on active learning to dramatically increase throughput over traditional docking. We leverage a novel selection protocol that strikes a balance between two objectives: (1) identifying the best scoring compounds and (2) exploring a large region of chemical space, demonstrating superior performance compared to a purely greedy approach. Together with automated redocking of the top compounds, this method captures almost all the high scoring scaffolds in the library found by exhaustive docking. This protocol is applied to our recent virtual screening campaigns against the D4 and AMPC targets that produced dozens of highly potent, novel inhibitors, and a blind test against the MT1 target. Our protocol recovers more than 80% of the experimentally confirmed hits with a 14-fold reduction in compute cost, and more than 90% of the hit scaffolds in the top 5% of model predictions, preserving the diversity of the experimentally confirmed hit compounds.
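The two selection objectives named above (picking the best-scoring compounds and exploring chemical space) can be sketched as a batch-selection step in an active learning loop. The abstract does not describe the actual selection rule, so the following is only one plausible scheme under stated assumptions: the surrogate model supplies a predicted docking score (more negative is better) and an uncertainty per compound, and `explore_fraction` is a hypothetical parameter.

```python
import numpy as np


def select_batch(pred_scores, uncertainties, batch_size, explore_fraction=0.5):
    """Pick compound indices to dock in the next active learning round.

    Assumption (not from the source): part of the batch is chosen
    greedily by best predicted score, and the remainder by highest
    model uncertainty among the compounds not already chosen.
    """
    pred = np.asarray(pred_scores, dtype=float)
    unc = np.asarray(uncertainties, dtype=float)

    n_explore = int(round(batch_size * explore_fraction))
    n_greedy = batch_size - n_explore

    # Greedy part: lowest (most favorable) predicted docking scores
    greedy = np.argsort(pred, kind="stable")[:n_greedy]

    # Exploration part: most uncertain compounds among the rest
    remaining = np.setdiff1d(np.arange(pred.size), greedy)
    order = np.argsort(-unc[remaining], kind="stable")[:n_explore]
    explore = remaining[order]

    return np.concatenate([greedy, explore])


# Hypothetical predictions for five compounds
batch = select_batch(
    pred_scores=[-9.0, -5.0, -8.0, -3.0, -7.0],
    uncertainties=[0.1, 0.9, 0.2, 0.8, 0.3],
    batch_size=4,
)
```

Setting `explore_fraction=0.0` reduces this sketch to a purely greedy selection, the baseline that the abstract reports as inferior.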
We present a reliable and accurate solution to the induced fit docking problem for protein-ligand binding by combining ligand-based pharmacophore docking, rigid receptor docking, and protein structure prediction with explicit solvent molecular dynamics simulations. In detailed retrospective and prospective testing, this novel methodology determined protein-ligand binding modes with a root-mean-square deviation within 2.5 Å in over 90% of cross-docking cases. We further demonstrate that these predicted ligand-receptor structures were sufficiently accurate to prospectively enable predictive structure-based drug discovery for challenging targets, substantially expanding the domain of applicability for such methods.
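The success criterion quoted above (ligand RMSD within 2.5 Å) can be made concrete with a short sketch. It assumes that the predicted and reference poses are already in the same receptor frame and that atoms are listed in the same order; symmetry-equivalent atom mappings, which a production tool would normally handle, are ignored. The function names are hypothetical.

```python
import numpy as np


def ligand_rmsd(pred_coords, ref_coords):
    """In-place RMSD (Angstrom) between two poses with matched atom order."""
    p = np.asarray(pred_coords, dtype=float)
    r = np.asarray(ref_coords, dtype=float)
    if p.shape != r.shape:
        raise ValueError("poses must have the same number of atoms")
    return float(np.sqrt(((p - r) ** 2).sum(axis=1).mean()))


def success_rate(rmsds, cutoff=2.5):
    """Fraction of cases whose RMSD is within the cutoff (Angstrom)."""
    return float(np.mean(np.asarray(rmsds, dtype=float) <= cutoff))


# Hypothetical pose shifted by 1 Angstrom along x relative to the reference
ref = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]
pred = [[1.0, 0.0, 0.0], [2.5, 0.0, 0.0], [2.5, 1.5, 0.0]]
rmsd = ligand_rmsd(pred, ref)
rate = success_rate([1.0, 3.0, 2.0, 2.5])
```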
We have developed a new methodology for protein-ligand docking and scoring, WScore, incorporating a flexible description of explicit water molecules. The locations and thermodynamics of the waters are derived from a WaterMap molecular dynamics simulation. The water structure is employed to provide an atomic level description of ligand and protein desolvation. WScore also contains a detailed model for localized ligand and protein strain energy and integrates an MM-GBSA scoring component with these terms to assess delocalized strain of the complex. Ensemble docking is used to take into account induced fit effects on the receptor conformation, and protein reorganization free energies are assigned via fitting to experimental data. The performance of the method is evaluated for pose prediction, rank ordering of self-docked complexes, and enrichment in virtual screening, using a large data set of PDB complexes and compared with the Glide SP and Glide XP models; significant improvements are obtained.
In the hit identification stage of drug discovery, a diverse chemical space needs to be explored to identify initial hits. In contrast to empirical scoring functions, absolute protein-ligand binding free-energy perturbation (ABFEP) provides a theoretically more rigorous and accurate description of protein-ligand binding thermodynamics and could, in principle, greatly improve the hit rates in virtual screening. In this work, we describe an implementation of an accurate and reliable ABFEP method in FEP+. We validated the ABFEP method on eight congeneric compound series binding to eight protein receptors, including both neutral and charged ligands. For ligands with net charges, the alchemical ion approach is adopted to avoid artifacts in electrostatic potential energy calculations. The calculated binding free energies correlate with experimental results with a weighted average of R² = 0.55 for the entire dataset. We also observe an overall root-mean-square error (RMSE) of 1.1 kcal/mol after shifting the zero-point of the simulation data to match the average experimental values. Through ABFEP calculations using apo versus holo protein structures, we demonstrated that the protein conformational and protonation state changes between the apo and holo proteins are the main physical factors contributing to the protein reorganization free energy, which manifests as an overestimation of the raw ABFEP-calculated binding free energies when the holo structures of the proteins are used. Furthermore, we performed ABFEP calculations in three virtual screening applications for hit enrichment. ABFEP greatly improves the hit rates as compared to docking scores or other methods like metadynamics. The good performance of ABFEP in rank ordering compounds demonstrated in this work confirms it as a useful tool to improve the hit rates in virtual screening, thus facilitating hit discovery.
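The two accuracy metrics reported above can be sketched for a single compound series: the RMSE after shifting the calculated values so that their mean matches the experimental mean, and the squared Pearson correlation. The per-series weighting behind the reported dataset-wide R² is not specified in the abstract and is not reproduced here; the function names and the example numbers are hypothetical.

```python
import numpy as np


def shifted_rmse(calc, expt):
    """RMSE (kcal/mol) after aligning the mean of calc to the mean of expt.

    Returns the constant offset that was added and the resulting RMSE.
    """
    c = np.asarray(calc, dtype=float)
    e = np.asarray(expt, dtype=float)
    offset = e.mean() - c.mean()
    rmse = float(np.sqrt(np.mean((c + offset - e) ** 2)))
    return float(offset), rmse


def r_squared(calc, expt):
    """Squared Pearson correlation between calculated and experimental values."""
    r = np.corrcoef(np.asarray(calc, dtype=float), np.asarray(expt, dtype=float))[0, 1]
    return float(r ** 2)


# Hypothetical binding free energies (kcal/mol) for one series
calc = [-12.0, -11.0, -10.0]
expt = [-9.5, -8.0, -6.5]
offset, rmse = shifted_rmse(calc, expt)
r2 = r_squared(calc, expt)
```

The offset removes any systematic overestimation of the raw values, such as the protein reorganization contribution discussed in the abstract, so the RMSE reflects only the scatter around the shifted values.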