Ligand binding affinity prediction is one of the most important applications of computational chemistry. However, accurately ranking compounds with respect to their estimated binding affinities to a biomolecular target remains highly challenging. We provide an overview of recent work using molecular mechanics energy functions to address this challenge. We briefly review methods that use molecular dynamics and Monte Carlo simulations to predict absolute and relative ligand binding free energies, as well as our own work in which we have developed a physics-based scoring method that can be applied to hundreds of thousands of compounds by invoking a number of simplifying approximations. In our previous studies, we have demonstrated that our scoring method is a promising approach for improving the discrimination between ligands that are known to bind and those that are presumed not to, in virtual screening of large compound databases. In new results presented here, we explore several improvements to our computational method including modifying the dielectric constant used for the protein and ligand interiors, and empirically scaling energy terms to compensate for deficiencies in the energy model. Future directions for further improving our physics-based scoring method are also discussed.
We have developed a virtual ligand screening method designed to help assign enzymatic function for alpha-beta barrel proteins. We dock a library of ∼19,000 known metabolites against the active site and attempt to identify the relevant substrate based on predicted relative binding free energies. These energies are computed using a physics-based energy function based on an all-atom force field (OPLS-AA) and a generalized Born implicit solvent model. We evaluate the ability of this method to identify the known substrates of several members of the enolase superfamily of enzymes, including both holo and apo structures (11 total). The active sites of these enzymes contain numerous charged groups (lysines, carboxylates, histidines, and one or more metal ions) and thus provide a challenge for most docking scoring functions, which treat electrostatics and solvation in a highly approximate manner. Using the physics-based scoring procedure, the known substrate is ranked within the top 6% of the database in all cases, and in 8 of 11 cases, it is ranked within the top 1%. Moreover, the top-ranked ligands are strongly enriched in compounds with high chemical similarity to the substrate (e.g., different substitution patterns on a similar scaffold). These results suggest that our method can be used, in conjunction with other information including genomic context and known metabolic pathways, to suggest possible substrates or classes of substrates for experimental testing. More broadly, the physics-based scoring method performs well on highly charged binding sites and is likely to be useful in inhibitor docking against polar binding sites as well. 
The method is fast (<1 min per ligand), due largely to an efficient minimization algorithm based on the truncated Newton method, and thus, it can be applied to thousands of ligands within a few hours on a small Linux cluster.
Computational ligand screening ("virtual screening" or "docking") is widely used in structure-based drug design projects to rapidly and inexpensively identify lead compounds (1-3). Here, we consider a different application of docking methods, namely to assist in the identification of possible substrates of an enzyme, when the function of the enzyme is unknown. We expect this capability to become increasingly important as the "production phase" of structural genomics efforts gets under way. These projects are expected to generate thousands of protein structures, with the ultimate goal of providing structural representatives of the majority of protein families. However, knowing the structure of a protein does not always uniquely or unambiguously suggest its function. Although the term "function" can encompass a broad range of meanings, here we refer specifically to the reactions that an enzyme can catalyze in vivo.
The alpha-beta barrel enzymes, a subset of which is considered in this paper, pose a particular challenge for functional annotation. The basic alpha-beta barrel scaffold of eight parallel strands forming a barrel, flanked by eight helices, is know...
We propose a sampling scheme to reduce the CPU time for Monte Carlo simulations of atomic systems. Our method is based on the separation of the potential energy into parts that are expected to vary at different rates as a function of coordinates. We perform n moves that are accepted or rejected according to the rapidly varying part of the potential, and the resulting configuration is accepted or rejected according to the slowly varying part. We test our method on a Lennard-Jones system. We show that use of our method leads to significant savings in CPU time. We also show that for moderate system sizes the scaling of CPU time with system size can be improved (for n = 40 the scaling is predominantly linear up to 1000 particles).
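The two-level acceptance rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hard distance split at `r_split`, the outer cutoff `r_cut`, the reduced Lennard-Jones units, and all function names are assumptions made for illustration. The slow energy is recomputed over all pairs for clarity, so the sketch shows the logic only and does not reproduce the CPU savings reported above.

```python
import math
import random

def lj(r2):
    """Lennard-Jones pair energy in reduced units, from squared distance."""
    inv6 = 1.0 / (r2 * r2 * r2)
    return 4.0 * (inv6 * inv6 - inv6)

def dist2(a, b, box):
    """Squared minimum-image distance in a cubic periodic box."""
    r2 = 0.0
    for x, y in zip(a, b):
        d = x - y
        d -= box * round(d / box)
        r2 += d * d
    return r2

def fast_energy(pos, i, xi, box, r_split):
    """Rapidly varying part: pairs of particle i (placed at xi) closer than r_split."""
    total = 0.0
    for j, xj in enumerate(pos):
        if j != i:
            r2 = dist2(xi, xj, box)
            if r2 < r_split * r_split:
                total += lj(r2)
    return total

def total_slow(pos, box, r_split, r_cut):
    """Slowly varying part: all pairs with r_split <= r < r_cut."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r2 = dist2(pos[i], pos[j], box)
            if r_split * r_split <= r2 < r_cut * r_cut:
                total += lj(r2)
    return total

def metropolis(delta, beta, rng):
    """Standard Metropolis test for an energy change delta."""
    return delta <= 0.0 or rng.random() < math.exp(-beta * delta)

def composite_move(pos, n, beta, box, step, r_split, r_cut, rng):
    """n inner moves judged on the fast part, then one test on the slow part."""
    trial = [p[:] for p in pos]
    slow_old = total_slow(pos, box, r_split, r_cut)
    for _ in range(n):
        i = rng.randrange(len(trial))
        new = [(x + rng.uniform(-step, step)) % box for x in trial[i]]
        d_fast = (fast_energy(trial, i, new, box, r_split)
                  - fast_energy(trial, i, trial[i], box, r_split))
        if metropolis(d_fast, beta, rng):
            trial[i] = new
    d_slow = total_slow(trial, box, r_split, r_cut) - slow_old
    if metropolis(d_slow, beta, rng):
        return trial, True   # whole sequence of inner moves is kept
    return pos, False        # whole sequence is discarded

def cubic_lattice(m, box):
    """m**3 particles on a simple cubic lattice (avoids initial overlaps)."""
    a = box / m
    return [[(i + 0.5) * a, (j + 0.5) * a, (k + 0.5) * a]
            for i in range(m) for j in range(m) for k in range(m)]

if __name__ == "__main__":
    rng = random.Random(0)
    box = 4.5
    pos = cubic_lattice(3, box)
    accepted = 0
    for _ in range(50):
        pos, ok = composite_move(pos, 10, 1.0, box, 0.2, 1.5, 2.2, rng)
        accepted += ok
    print("composite moves accepted:", accepted, "of 50")
```

Because the inner moves sample the Boltzmann distribution of the fast part alone, testing the final configuration against only the change in the slow part is enough to sample the full potential. A practical version would evaluate the fast part with neighbor lists so that each inner move costs far less than a full energy evaluation.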
The atomic-level mechanisms of protein regulation by post-translational phosphorylation remain poorly understood, except in a few well-studied systems. Molecular mechanics simulations can in principle be used to help understand and predict the effects of protein phosphorylation, but the accuracy of the results will of course depend on the quality of the force field parameters for the phosphorylated residues as well as the quality of the solvent model. The phosphorylated residues typically carry a -2 charge at physiological pH; however, the effects of phosphorylation can sometimes be mimicked by substituting Asp or Glu for the phosphorylated residue. Here we examine the suitability of explicit and implicit solvent models for simulating phospho-serine in both the -1 and -2 charge states. Specifically, we simulate a capped phosphorylated peptide, Ace-Gly-Ser-pSer-Ser-Nme, and compare the results to each other and to experimental observables from an NMR experiment. The first major conclusion is that explicit water models (TIP3P, TIP4P and SPC/E) and a Generalized Born implicit solvent model provide reasonable agreement with the experimental observables, given appropriate partial charges for the phosphate group. The Generalized Born results, however, show greater hydrogen bonding propensity than the explicit solvent results. Distance dependent dielectric treatments perform poorly. The second major conclusion is that many ensemble-averaged properties obtained for the phosphopeptide in the -1 and -2 charge states are strikingly similar; the -1 species has a slightly higher propensity to form internal hydrogen bonds. All of the results can be rationalized by quantifying the strength of the P-O/H-N hydrogen bond, which depends on a sensitive balance between strongly favorable charge/dipole and dipole/dipole interactions and strongly unfavorable desolvation.
Motivated by their participation in the McMaster Data-Mining and Docking Competition, the authors developed 2 new computational technologies and applied them to docking against Escherichia coli dihydrofolate reductase: a receptor preparation procedure that incorporates rotamer optimization of side chains and a physics-based rescoring procedure for estimating relative binding affinities of the protein-ligand complexes. Both methods use the same energy function, consisting of the all-atom OPLS-AA force field and a generalized Born solvent model, which treats the protein receptor and small-molecule ligands in a consistent manner. Thus, the energy function is similar to that used in more sophisticated approaches, such as free-energy perturbation and the molecular mechanics Poisson-Boltzmann/surface area, but sampling during the rescoring procedure is limited to simple energy minimization of the ligand. The use of a highly efficient minimization algorithm permitted the authors to apply this rescoring procedure to hundreds of thousands of protein-ligand complexes during the competition, using a modest Linux cluster. To test these methods, they used the 12 competitive inhibitors identified in the training set, plus methotrexate, as positive controls in enrichment studies with both the training and test sets, each containing 50,000 compounds. The key conclusion is that combining the receptor preparation and rescoring methods makes it possible to identify most of the positive controls within the top few tenths of a percent of the rank-ordered training and test set libraries.
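An enrichment study of the kind summarized in this abstract asks how many positive controls fall within the top fraction of a rank-ordered library. The Python sketch below shows one common way to compute this. The function names, the score dictionary, and the convention that a lower (more negative) score means a better predicted binder are assumptions for illustration, and the data in the usage example are synthetic rather than taken from the competition.

```python
def top_of_ranking(scores, top_fraction):
    """Return the set of compound ids in the best-scoring fraction of the library.

    scores maps compound id -> predicted binding energy (lower is better).
    """
    ranked = sorted(scores, key=scores.get)
    n_top = max(1, round(top_fraction * len(ranked)))
    return set(ranked[:n_top])

def fraction_recovered(scores, actives, top_fraction):
    """Fraction of the known actives found in the top of the ranking."""
    top = top_of_ranking(scores, top_fraction)
    return sum(1 for a in actives if a in top) / len(actives)

def enrichment_factor(scores, actives, top_fraction):
    """Hit rate in the top fraction divided by the hit rate of the whole library."""
    top = top_of_ranking(scores, top_fraction)
    hits = sum(1 for a in actives if a in top)
    return (hits / len(top)) / (len(actives) / len(scores))

if __name__ == "__main__":
    # Synthetic library: compound i has score i, so ids 0..9 form the top 1%.
    scores = {i: float(i) for i in range(1000)}
    actives = [0, 1, 2, 500]
    print(fraction_recovered(scores, actives, 0.01))
    print(enrichment_factor(scores, actives, 0.01))
```

An enrichment factor of 1 corresponds to random selection, so values well above 1 in the top few tenths of a percent indicate that the scoring separates the positive controls from the rest of the library.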