The dynamic nature and comparatively young age of computational chemistry mean that novel algorithms continue to be developed at a rapid pace. Such development often comes at the expense of extensive experimental validation of the techniques, preventing a deeper understanding of their potential utility and limitations. Here we address this issue for ligand-based virtual screening descriptors through the design of validation experiments that better reflect the aims of real-world application. Applying the newly defined chemotype enrichment approach, a variety of two- and three-dimensional (2D/3D) similarity descriptors have been compared extensively across data sets from four diverse target types. The inhibitors within these data sets exhibit a wide array of substructure functionality, size and flexibility, permitting descriptor comparison in many settings. Relative descriptor performance under these conditions is examined, including results obtained using more typical virtual screening validation experiments. Guidelines for optimal application of the descriptors are also discussed in the context of the results obtained, as is the potential utility of fingerprint filtering.
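As a rough illustration of why counting chemotypes can differ from counting actives, the following minimal sketch ranks a toy library by 2D fingerprint similarity and reports which active chemotypes appear in the top of the list. The abstract does not define chemotype enrichment, so the scoring used here (distinct chemotype labels among retrieved actives) is an assumption, as are the function names, the set-of-on-bits fingerprint representation, and the toy data.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints stored as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def chemotypes_in_top(query, library, top_n):
    """Rank the library by similarity to the query and return the distinct
    chemotype labels of the actives found in the first top_n positions.
    (Assumed scoring; not necessarily the published definition.)"""
    ranked = sorted(library, key=lambda m: tanimoto(query, m["fp"]), reverse=True)
    return {m["chemotype"] for m in ranked[:top_n] if m["active"]}


# Hypothetical toy data: bit indices and chemotype labels are made up.
query = frozenset({1, 2, 3, 4})
library = [
    {"fp": frozenset({1, 2, 3, 5}), "active": True, "chemotype": "A"},
    {"fp": frozenset({1, 2, 3, 4, 6}), "active": True, "chemotype": "A"},
    {"fp": frozenset({2, 3, 4, 9}), "active": True, "chemotype": "B"},
    {"fp": frozenset({7, 8, 9}), "active": False, "chemotype": "C"},
    {"fp": frozenset({1, 8}), "active": False, "chemotype": "D"},
]

top2 = chemotypes_in_top(query, library, 2)
top3 = chemotypes_in_top(query, library, 3)
print(top2, top3)
```

In this toy case the top two hits are both actives but share a single chemotype, so an active-count metric would reward the ranking more than a chemotype-count metric would.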
One of the early and effective approaches to G-protein-coupled receptor (GPCR) target family library design was the analysis of a set of ligands for frequently occurring chemical moieties or substructures. Various methods ranging from frameworks analysis to pharmacophores have been employed to find these so-called target-family-privileged substructures. Although the use of these substructures is common practice in combinatorial library design and has produced leads, the methods used for finding them rarely verified their selectivity for the particular target family from which they were derived. The frequency of occurrence among ligands associated with a target receptor family is not a sufficient criterion for those substructures to receive the label of target-family-privileged substructure. This study explores the question of selectivity of ClassPharmer-generated fragments for a series of target families: GPCRs, nuclear hormone receptors, serine proteases, protein kinases, and ligand-gated ion channels. In addition, a GPCR-focused library and a random set of 10k compounds are examined in terms of their target-family-privileged-substructure composition. The results challenge the combinatorial chemistry concept of target-family-privileged substructures and suggest that many of these fragments may simply be drug-like or attractive for various receptors in accordance with the original definition of privileged substructures.
A kinome-wide selectivity screen of >20000 compounds with a rich representation of many structural classes has been completed. Analysis of the selectivity patterns for each class shows that a broad spectrum of structural scaffolds can achieve specificity for many kinase families. Kinase selectivity and potency are inversely correlated, a trend that is also found in a large set of kinase functional data. Although selective and nonselective compounds are mostly similar in their physicochemical characteristics, we identify specific features that are present more frequently in compounds that bind to many kinases. Our results support a scaffold-oriented approach for building compound collections to screen kinase targets.
The principle of bioisosterism (similarly shaped molecules are more likely to share biological properties than are other molecules) has long helped to guide drug discovery. An algorithmic implementation of this principle, based on shape comparisons of a single rule-generated "topomer" conformation per molecule, had been found to be the descriptor most consistently predictive of similar biological properties, in retrospective studies, and also to be well-suited for searching large (>10^12) "virtual libraries" of potential reaction products. Therefore a prospective trial of this shape similarity searching method was carried out, with synthesis of 425 compounds and testing of them for inhibition of binding of angiotensin II (A-II). The 63 compounds that were identified by shape searching as most similar to any of four query structures included all of the seven compounds found to be highly active, with none of the other 362 structures being highly active (p < 0.001). Additional consistent relations (p < 0.05) were found, among all 425 compounds, between the degree of shape similarity to the nearest query structure and the frequency of various levels of observed activity. Known "SAR" (rules specifying structural features required for A-II antagonism) were also regenerated within the biological data for the 63 shape similar structures.
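The reported significance can be sanity-checked with a simple hypergeometric argument: if the 7 highly active compounds were scattered at random among the 425 tested, how likely is it that all 7 would land in the 63 selected by shape searching? The sketch below computes that probability. The abstract does not say which statistical test the authors used, so this is only an assumed back-of-the-envelope check, and the function name is hypothetical.

```python
from math import comb


def prob_all_actives_in_selection(total: int, selected: int, actives: int) -> float:
    """Probability that every active falls inside the selected subset when
    actives are placed uniformly at random (hypergeometric tail)."""
    return comb(selected, actives) / comb(total, actives)


# Numbers taken from the abstract: 425 compounds, 63 selected, 7 highly active.
p = prob_all_actives_in_selection(total=425, selected=63, actives=7)
print(f"p = {p:.2e}")
```

Under this random-placement assumption the probability is far below the 0.001 threshold quoted in the abstract.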
A novel Genetic Algorithm guided Selection method, GAS, is described. The method utilizes a simple encoding scheme which can represent both compounds and variables used to construct a QSAR/QSPR model. A genetic algorithm is then utilized to simultaneously optimize the encoded variables, which include both descriptors and compound subsets. The GAS method generates multiple models, each applying to a subset of the compounds. Typically the subsets represent clusters with different chemotypes. A procedure based on molecular similarity is also presented to determine which model should be applied to a given test set compound. The variable selection method implemented in GAS has been tested and compared using the Selwood data set (n = 31 compounds; v = 53 descriptors). The results showed that the method is comparable to other published methods. The subset selection method implemented in GAS was first tested using an artificial data set (n = 100 points; v = 1 descriptor) to examine its ability to subset data points, and then applied to analyze the XLOGP data set (n = 1831 compounds; v = 126 descriptors). The method is able to correctly identify artificial data points belonging to various subsets. The analysis of the XLOGP data set shows that the subset selection method can be useful in improving a QSAR/QSPR model when the variable selection method fails.
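The subset-selection idea can be sketched on a single-descriptor toy problem similar in spirit to the artificial data set mentioned above. In this minimal sketch, a chromosome assigns each data point to one of several subsets, and the fitness is the total squared error of one straight-line fit per subset. Everything here is an assumption made for illustration (the encoding, the fitness, the crossover and mutation operators, and all parameter values); it is not the published GAS procedure, and it omits the descriptor-selection half of the encoding.

```python
import random


def sse_of_line_fit(points):
    """Sum of squared residuals of a least-squares line through (x, y) points."""
    n = len(points)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in points)


def fitness(chromosome, data, n_subsets):
    """Lower is better: total error of one line model per compound subset."""
    return sum(
        sse_of_line_fit([p for p, g in zip(data, chromosome) if g == k])
        for k in range(n_subsets)
    )


def evolve(data, n_subsets=2, pop_size=40, generations=60, seed=0):
    """Elitist GA over subset assignments; returns best chromosome and the
    best fitness recorded at each generation."""
    rng = random.Random(seed)
    n = len(data)

    def score(c):
        return fitness(c, data, n_subsets)

    pop = [[rng.randrange(n_subsets) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)
        history.append(score(pop[0]))
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mum, dad = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = mum[:cut] + dad[cut:]
            child[rng.randrange(n)] = rng.randrange(n_subsets)  # point mutation
            children.append(child)
        pop = elite + children
    best = min(pop, key=score)
    return best, history + [score(best)]


# Hypothetical toy data: two noise-free linear trends mixed together.
data = [(x, 2.0 * x) for x in range(10)] + [(x, 10.0 - x) for x in range(10)]
true_labels = [0] * 10 + [1] * 10
best, history = evolve(data)
print(history[0], history[-1], fitness(true_labels, data, 2))
```

Because the better half of the population is carried over unchanged, the best fitness never worsens between generations; whether the search recovers the true grouping depends on the population size, generation count, and seed.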