The computer system PASS provides simultaneous prediction of several hundred biological activity types for any drug-like compound. The prediction is based on analysis of structure-activity relationships in a training set of more than 30000 known biologically active compounds. In this paper we investigate how the accuracy of PASS activity predictions is affected by (a) reducing the number of structures in the training set and (b) reducing the number of known activities in the training set. Compounds from the MDDR database are used to create heterogeneous training and evaluation sets. We demonstrate that predictions are robust despite the exclusion of up to 60% of information.
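The two kinds of reduction described in this abstract can be pictured with a small sketch. It assumes the removal is random and that the training set can be represented as a mapping from compound identifiers to sets of activity names; neither detail is stated in the abstract, and the function name `reduce_training_set` is hypothetical rather than part of PASS.

```python
import random


def reduce_training_set(training, frac_structures=0.0, frac_activities=0.0, seed=0):
    """Return a reduced copy of `training` (compound id -> set of activity names).

    Assumption: removal is random. The abstract does not say how structures
    or activities were chosen for exclusion.
    """
    rng = random.Random(seed)
    ids = sorted(training)

    # (a) Drop a fraction of the structures.
    keep_n = round(len(ids) * (1.0 - frac_structures))
    kept_ids = rng.sample(ids, keep_n)

    # (b) Drop a fraction of the known activities of each remaining structure.
    reduced = {}
    for cid in kept_ids:
        activities = sorted(training[cid])
        reduced[cid] = {a for a in activities if rng.random() >= frac_activities}
    return reduced
```

A model would then be retrained on the reduced set and its accuracy compared on an untouched evaluation set; that step depends on PASS itself and is not sketched here.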
The application of the program PASS (Prediction of Activity Spectra for Substances) to about 250 000 compounds of the NCI Open Database and the incorporation of over 64 million PASS predictions in the Enhanced NCI Database Browser are described. A total of 565 different types of activity are included, encompassing general pharmacological effects, specific mechanisms of action, known toxicities, and others. Application of this Web-based service to the prediction of activities of the kinds "Angiogenesis inhibitor" and "Antiviral (HIV)", and of a set of activities that can be associated with antineoplastic action, is reported. For this latter data set, a very substantial enrichment over random selection was found in the PASS predictions. It is shown how the user can conduct complex searches by combining ranges of PASS-predicted probabilities of compounds to be active or to be inactive, respectively, with, e.g., value ranges of physicochemical parameters, presence or absence of particular substructural fragments, and other search criteria.
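A combined search of the kind described in this abstract might look like the following minimal sketch. It is not the interface of the Enhanced NCI Database Browser: the `Compound` record, the `search` function, the use of logP as the physicochemical parameter, and all values in the usage lines are hypothetical and only illustrate how probability ranges can be combined with other criteria.

```python
from dataclasses import dataclass, field
from math import inf


@dataclass
class Compound:
    name: str
    pa: dict      # activity name -> predicted probability of being active
    pi: dict      # activity name -> predicted probability of being inactive
    logp: float   # example physicochemical parameter (assumed for illustration)
    fragments: set = field(default_factory=set)


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def search(compounds, activity, pa_range=(0.0, 1.0), pi_range=(0.0, 1.0),
           logp_range=(-inf, inf), with_fragment=None, without_fragment=None):
    """Return names of compounds that satisfy every criterion at once."""
    hits = []
    for c in compounds:
        if activity not in c.pa or activity not in c.pi:
            continue  # no prediction stored for this activity type
        if not in_range(c.pa[activity], pa_range):
            continue
        if not in_range(c.pi[activity], pi_range):
            continue
        if not in_range(c.logp, logp_range):
            continue
        if with_fragment is not None and with_fragment not in c.fragments:
            continue
        if without_fragment is not None and without_fragment in c.fragments:
            continue
        hits.append(c.name)
    return hits


# Made-up records, for illustration only.
library = [
    Compound("cpd-1", {"Angiogenesis inhibitor": 0.8},
             {"Angiogenesis inhibitor": 0.05}, 2.1, {"amide"}),
    Compound("cpd-2", {"Angiogenesis inhibitor": 0.3},
             {"Angiogenesis inhibitor": 0.40}, 1.0),
]
likely_active = search(library, "Angiogenesis inhibitor",
                       pa_range=(0.7, 1.0), pi_range=(0.0, 0.1))
```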
A new method for assessment of molecular similarity based on an original description of chemical structure is discussed. The accuracy of similarity assessment obtained with this method is compared with that of four other approaches. The same evaluation set is used to predict (a) the boiling points of 139 hydrocarbons and (b) the mutagenicity of 15 nitrosamines. The results show that the proposed method provides a reasonable appraisal of both properties, and that its prediction of mutagenicity is more accurate than that of the alternatives.
We present here a greatly updated version of an earlier study on the conformational energies of protein-ligand complexes in the Protein Data Bank (PDB) [Nicklaus et al. Bioorg. Med. Chem. 1995, 3, 411-428], with the goal of improving on all possible aspects such as number and selection of ligand instances, energy calculations performed, and additional analyses conducted. Starting from about 357,000 ligand instances deposited in the 2008 version of the Ligand Expo database of the experimental 3D coordinates of all small-molecule instances in the PDB, we created a "high-quality" subset of ligand instances by various filtering steps, including the application of crystallographic quality criteria and a requirement of structural unambiguity. Submission of 640 Gaussian 03 jobs yielded a set of about 415 successfully concluded runs. We used a stepwise optimization of internal degrees of freedom at the DFT level of theory with the B3LYP functional and the 6-31G(d) basis set, and a single-point energy calculation at B3LYP/6-311++G(3df,2p) after each round of (partial) optimization, to separate energy changes due to bond length stretches vs bond angle changes vs torsion changes. Even for the most "conservative" choice of all the possible conformational energies (the energy difference between the conformation in which all internal degrees of freedom except torsions have been optimized and the fully optimized conformer), significant energy values were found. The range of 0 to ~25 kcal/mol was populated quite evenly and independently of the crystallographic resolution. A smaller number of "outliers" of yet higher energies were seen only at resolutions above 1.3 Å.
The energies showed some correlation with molecular size and flexibility but not with crystallographic quality metrics such as the Cruickshank diffraction-component precision index (DPI) and R(free)-R, or with the ligand instance-specific metrics such as occupancy-weighted B-factor (OWAB), real-space R factor (RSR), and real-space correlation coefficient (RSCC). We repeated these calculations with the solvent model IEFPCM, which yielded energy differences that were generally somewhat lower than the corresponding vacuum results but did not produce a qualitatively different picture. Torsional sampling around the crystal conformation at the molecular mechanics level using the MMFF94s force field typically led to an increase in energy.
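The energy bookkeeping in this abstract can be illustrated with a minimal sketch. It assumes four single-point energies in hartree per ligand instance and a relaxation order of bond lengths, then bond angles, then torsions; the abstract names these three contributions but does not spell out the order. The function name and the numbers in the usage line are hypothetical and are not results from the study.

```python
# 1 hartree expressed in kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def energy_breakdown(e_crystal, e_bonds_opt, e_angles_opt, e_full_opt):
    """Split the energy released on optimization into stepwise contributions.

    All inputs are single-point energies in hartree:
      e_crystal    - crystal conformation as deposited
      e_bonds_opt  - after optimizing bond lengths only (assumed first step)
      e_angles_opt - after also optimizing bond angles (torsions still frozen)
      e_full_opt   - fully optimized conformer
    Returns contributions in kcal/mol.
    """
    def kcal(higher, lower):
        return (higher - lower) * HARTREE_TO_KCAL_PER_MOL

    return {
        "bond_lengths": kcal(e_crystal, e_bonds_opt),
        "bond_angles": kcal(e_bonds_opt, e_angles_opt),
        # The "conservative" conformational energy: everything except
        # torsions optimized, compared with the fully optimized conformer.
        "torsions": kcal(e_angles_opt, e_full_opt),
        "total": kcal(e_crystal, e_full_opt),
    }


# Made-up energies, for illustration only.
parts = energy_breakdown(-100.00, -100.01, -100.02, -100.03)
```

Because the steps are consecutive, the three contributions add up to the total, so the torsional term can be read as the part of the strain that remains once bond lengths and angles have been relaxed.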