A total of 21 833 inhibitors of the central nervous system (CNS) were collected from Binding-database and analyzed using discriminant analysis (DA) techniques. A combination of genetic algorithm and quadratic discriminant analysis (GA-QDA) was proposed as a tool for classifying molecules by their therapeutic targets and activities. The results indicated that the one-against-one (OAO) QDA classifiers correctly separate the molecules by therapeutic target and perform comparably to support vector machines. These classifiers help in charting the chemical space of the CNS inhibitors and in finding the specific subspaces occupied by particular classes of molecules. As a next step, the classification models were used as virtual filters for screening random subsets of the PubChem and ZINC databases. The calculated enrichment factors, together with the area under the curve values of the receiver operating characteristic curves, showed that these classifiers are good candidates for speeding up the early stages of drug discovery projects. The "relative distances" of the centers of active classes of biosimilar molecules, calculated by the OAO classifiers, were used as indices for sorting the compound databases. The results revealed that the multiclass classification models in this work circumvent the definition of inactive sets for virtual screening and are useful for compound retrieval analysis in chemoinformatics.
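As an illustration of the two screening metrics named above, the following minimal sketch computes an enrichment factor and a ROC area under the curve from a score-ranked list. It is not the authors' code: the function names, the `fraction` parameter, the convention that a higher score means "more likely active", and the toy scores and labels are all assumptions made for the example.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranked list.

    Assumes higher score = more likely active and labels are 1 (active) or 0.
    """
    n = len(scores)
    n_top = max(1, int(round(n * fraction)))
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("the list contains no actives")
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top slice divided by the hit rate of the whole list.
    return (hits_top / n_top) / (total_actives / n)


def roc_auc(scores, labels):
    """ROC AUC as the probability that a random active outranks a random
    inactive (Mann-Whitney formulation); ties count as one half."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    inactives = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in inactives
    )
    return wins / (len(actives) * len(inactives))


# Hypothetical toy data: ten screened compounds, three of them active.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
auc = roc_auc(toy_scores, toy_labels)
```

An enrichment factor above 1 means the top of the ranked list holds more actives than random selection would give, and an AUC of 0.5 corresponds to random ranking.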
This paper introduces the algorithms, implementation strategies, features, and applications of CS-MINER, a tool for visualization and analysis of drug-like chemical space. CS-MINER, short for Chemical Space Miner, systematically correlates the medicinal target space with chemical space. The database in this software consists of a large collection of drug-like molecules. To prepare this database, a large number of molecules for 110 important biological targets were collected from Binding-DB. A total of 1497 physicochemical properties were calculated for each molecule. CS-MINER uses discriminant analysis techniques to trace the collected data and ultimately separates the molecules based on their therapeutic targets and activities. The developed multivariate classifiers can be used for ligand-based virtual screening of more than 0.5 million random molecules from the PubChem and ZINC databases. To validate the models, selected subspaces in CS-MINER were compared with DrugBank molecules. At the end of the analysis, the software provides an interactive environment for visualizing the selected chemical subspaces as 2- and 3-dimensional plots. In general, CS-MINER is a tool for comparing the relative positions of active biosimilar molecules in chemical space and is freely available at www.csminer.com.
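The idea of separating molecules by target with discriminant analysis can be sketched as follows. This is a simplified illustration, not the CS-MINER implementation: it uses a quadratic discriminant with a diagonal covariance per class (full QDA would use the full covariance matrix), assumes equal class priors, applies pairwise one-against-one voting, and omits any descriptor selection. All function names, the class labels `A`, `B`, `C`, and the two-descriptor toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def fit_class(rows):
    """Per-descriptor mean and variance for one class (diagonal covariance)."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # A small floor keeps every variance strictly positive.
    variances = [
        max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-9)
        for j in range(dims)
    ]
    return means, variances


def log_density(x, params):
    """Gaussian log-density of x under one class model."""
    means, variances = params
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )


def fit_models(X, y):
    """Fit one class model per target label."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return {label: fit_class(rows) for label, rows in by_class.items()}


def predict_oao(x, models):
    """One-against-one voting: each pair of classes casts one vote.

    Requires at least two classes.
    """
    votes = Counter()
    for a, b in combinations(sorted(models), 2):
        winner = a if log_density(x, models[a]) >= log_density(x, models[b]) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two descriptors per molecule, three target classes.
X = [
    [0.0, 0.0], [0.2, -0.1], [-0.1, 0.1],   # class A
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],     # class B
    [0.0, 5.0], [0.1, 5.2], [-0.2, 4.9],    # class C
]
y = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
models = fit_models(X, y)
predicted = [predict_oao(q, models) for q in ([0.1, 0.0], [5.0, 5.05], [0.0, 5.1])]
```

In a real one-against-one setup, each pair of classes could have its own classifier trained on its own descriptor subset; here the pairs share per-class models to keep the sketch short.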