The biochemical half-maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization and to build large-scale chemogenomics analyses as well as off-target activity and toxicity models based on public data. However, the use of public biochemical IC50 data is problematic because the values are assay specific and comparable only under certain conditions. For large-scale analysis it is not feasible to check each data entry manually, and it is very tempting to mix all available IC50 values from public databases even if assay information is not reported. As previously reported for the analysis of Ki data, we first analyzed the types of errors, the redundancy, and the variability found in the ChEMBL IC50 data. To assess the variability of IC50 data measured independently in two different labs, ChEMBL was searched for at least ten IC50 values for identical protein-ligand systems against the same target. Because too few cases of this type are available, the variability of IC50 data was instead assessed by comparing all pairs of independent IC50 measurements on identical protein-ligand systems. The standard deviation of IC50 data is only 25% larger than that of Ki data, suggesting that mixing IC50 data from different assays, even without knowing the details of the assay conditions, adds only a moderate amount of noise to the overall data. As expected, the standard deviation of public ChEMBL IC50 data was greater than that of in-house intra-laboratory/inter-day IC50 data. Augmenting mixed public IC50 data with public Ki data does not degrade the quality of the mixed IC50 data, provided the Ki is corrected by an offset. For a broad dataset such as ChEMBL, a Ki-IC50 conversion factor of 2 was found to be the most reasonable.
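The pairwise comparison and the Ki offset described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data layout, the function names, and the example values are hypothetical. Two assumptions are made explicit. First, each measurement has the same independent error variance, so the standard deviation of a single measurement is estimated as sqrt(sum(d^2) / 2n) over the n pairwise differences d. Second, the conversion factor is applied as IC50 ≈ 2 × Ki, the direction implied by the Cheng-Prusoff relation for competitive inhibition at a substrate concentration equal to Km; the abstract itself does not state the direction.

```python
import math
from itertools import combinations


def p_value(conc_nm):
    """Convert a concentration in nM to its negative log10 molar value (e.g. pIC50)."""
    return 9.0 - math.log10(conc_nm)


def ki_to_ic50_nm(ki_nm, factor=2.0):
    """Shift a Ki onto the IC50 scale (assumed direction: IC50 = factor * Ki)."""
    return ki_nm * factor


def pairwise_differences(measurements):
    """Collect differences between independent measurements of the same system.

    measurements maps (target, ligand) -> list of (source_id, p_value).
    Pairs that share a source_id are skipped, since they are not independent.
    """
    diffs = []
    for values in measurements.values():
        for (src_a, a), (src_b, b) in combinations(values, 2):
            if src_a != src_b:
                diffs.append(a - b)
    return diffs


def sd_from_pairs(diffs):
    """Estimate the per-measurement standard deviation from pairwise differences.

    Assumes the difference of two independent measurements has variance 2 * sigma^2.
    """
    return math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))


# Hypothetical example data: two protein-ligand systems, three sources.
example = {
    ("T1", "L1"): [("docA", 6.0), ("docB", 6.4)],
    ("T1", "L2"): [("docA", 7.0), ("docA", 7.5), ("docC", 7.2)],
}
example_diffs = pairwise_differences(example)
example_sd = sd_from_pairs(example_diffs)
```

On real data, pairs that arise from repeated citation of a single measurement would need to be removed before this step, because they contribute spurious zero differences and bias the estimate downward.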
The maximum achievable accuracy of in silico models depends on the quality of the experimental data. Consequently, experimental uncertainty defines a natural upper limit to the predictive performance possible. Models that yield errors smaller than the experimental uncertainty are necessarily overtrained. A reliable estimate of the experimental uncertainty is therefore of high importance to all originators and users of in silico models. The data deposited in ChEMBL was analyzed for reproducibility, i.e., the experimental uncertainty of independent measurements. Careful filtering of the data was required because ChEMBL contains unit-transcription errors, undifferentiated stereoisomers, and repeated citations of single measurements (90% of all pairs). The experimental uncertainty is estimated to yield a mean error of 0.44 pKi units, a standard deviation of 0.54 pKi units, and a median error of 0.34 pKi units. The maximum possible squared Pearson correlation coefficient (R²) on large data sets is estimated to be 0.81.
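The idea that experimental noise caps the achievable R² can be sketched as follows. This is an illustration under a simple additive-noise assumption, not the procedure used to obtain the 0.81 quoted above: if each measured value is the true value plus independent noise, then even a perfect model that predicts the true values reaches an expected squared Pearson correlation of var(true) / (var(true) + var(noise)). The function names and the numbers in the example are hypothetical, and the spread of true values is an input that the abstract does not report.

```python
import math
import random


def expected_max_r2(sd_true, sd_noise):
    """Expected R^2 of a perfect model against noisy measurements (additive noise)."""
    return sd_true ** 2 / (sd_true ** 2 + sd_noise ** 2)


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


def simulated_max_r2(sd_true, sd_noise, n=20000, seed=0):
    """Check the closed form by simulation: perfect predictions vs. noisy measurements."""
    rng = random.Random(seed)
    true_values = [rng.gauss(0.0, sd_true) for _ in range(n)]
    measured = [t + rng.gauss(0.0, sd_noise) for t in true_values]
    return pearson_r2(true_values, measured)


# Hypothetical inputs: true activities spread with SD 1.0, noise SD 0.5.
closed_form = expected_max_r2(1.0, 0.5)
simulated = simulated_max_r2(1.0, 0.5)
```

The ceiling rises with the spread of the data set: a narrow activity range leaves less true variance for a model to explain relative to the same measurement noise.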
We validate an automated implementation of a combined Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) method (VSGB 2.0 energy model) on a large and diverse selection of protein-ligand complexes (855 complexes). Although this data set is diverse with respect to both protein families and ligands, after carefully removing flawed structures, a significant correlation (R² = 0.63) between calculated and experimental binding affinities is obtained. Consistent explanations for "outlier" complexes are found. Visual analysis of the crystal structures and recourse to the original literature reveal that neglect of explicit solvent, ligand strain, and entropy contributes to the under- and overestimation of computed affinities. The limits of the Molecular Mechanics/Implicit Solvent approach to accurately estimating protein-ligand binding affinities are discussed, as is the influence of the quality of protein-ligand complexes on computed binding free energy values.
We introduce a spectroscopic method that determines nonlinear quantum mechanical response functions beyond the optical diffraction limit and allows direct imaging of nanoscale coherence. In established coherent two-dimensional (2D) spectroscopy, four-wave-mixing responses are measured using three ingoing waves and one outgoing wave; thus, the method is diffraction-limited in spatial resolution. In coherent 2D nanoscopy, we use four ingoing waves and detect the final state via photoemission electron microscopy, which has 50-nanometer spatial resolution. We recorded local nanospectra from a corrugated silver surface and observed subwavelength 2D line shape variations. Plasmonic phase coherence of localized excitations persisted for about 100 femtoseconds and exhibited coherent beats. The observations are best explained by a model in which coupled oscillators lead to Fano-like resonances in the hybridized dark- and bright-mode response.
Multipole (MTP) electrostatics provides the means to describe anisotropic interactions in a rigorous and systematic manner. A number of earlier molecular dynamics (MD) implementations have increasingly relied on the use of molecular symmetry to reduce the (possibly large) number of MTP interactions. Here, we present a CHARMM implementation of MTP electrostatics in terms of spherical harmonics. By relying on a systematic set of reference-axis systems tailored to various chemical environments, we obtain an implementation that is both efficient and scalable for (bio)molecular systems. We apply the method to a series of halogenated compounds to show (i) energy conservation; (ii) improvements in reproducing thermodynamic properties compared to standard point-charge (PC) models; (iii) performance of the code; and (iv) better stabilization of a brominated ligand in a target protein, compared to a PC force field. The implementation provides interesting perspectives toward a dual PC/MTP resolution, à la QM/MM.