In silico methods of phenotypic screening are needed to reduce the time and cost of experimental in vivo screening for anticancer agents among tens of millions of natural and synthetic chemical compounds. We used the previously developed PASS (Prediction of Activity Spectra for Substances) algorithm to create and validate classification SAR models that predict the cytotoxicity of chemicals against different types of human cell lines from ChEMBL experimental data. A training set of 59,882 compound structures was created from the experimental data (GI50, IC50, and % inhibition values) in ChEMBL. The average accuracy of prediction (AUC), calculated during training by leave-one-out and 20-fold cross-validation, was 0.930 and 0.927, respectively, for 278 cancer cell lines, and 0.948 and 0.947, respectively, for 27 normal cell lines. Using these SAR models, we developed a freely available web service, CLC-Pred (Cell-Line Cytotoxicity Predictor), which predicts the cell-line cytotoxicity profile of a compound from its structural formula: http://way2drug.com/Cell-line/.
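The validation scheme described above (AUC estimated by leave-one-out and by k-fold cross-validation) can be illustrated with a minimal sketch. It is not the PASS implementation: the function names `auc`, `k_fold_indices`, and `cross_validated_auc`, and the `fit_predict` callback, are hypothetical and introduced only for illustration. AUC is computed here as the fraction of (active, inactive) pairs that the scores rank correctly, with ties counted as one half.

```python
from typing import Callable, List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random active outranks a random inactive."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k interleaved folds; k == n gives leave-one-out."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validated_auc(
    xs: Sequence,
    labels: Sequence[int],
    fit_predict: Callable[[list, list, list], List[float]],
    k: int,
) -> float:
    """Score every example with a model that never saw it, then compute one AUC."""
    scores = [0.0] * len(xs)
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        preds = fit_predict(
            [xs[i] for i in train],
            [labels[i] for i in train],
            [xs[i] for i in held_out],
        )
        for i, p in zip(held_out, preds):
            scores[i] = p
    return auc(scores, labels)
```

With `k=20` this corresponds to a 20-fold procedure, and with `k=len(xs)` to leave-one-out. In the setting of the abstract, such a value would be computed for each cell line separately and then averaged over the cell lines.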
An essential characteristic of a chemical compound is its biological activity, which can either form the basis for therapeutic use of the substance or limit its practical application because of side effects and toxicity. Computer assessment of biological activity spectra makes it possible to identify the most promising directions for studying the pharmacological action of particular substances and to filter out potentially dangerous molecules at the early stages of research. For more than 25 years, we have been developing and improving the computer program PASS (Prediction of Activity Spectra for Substances), designed to predict the biological activity spectrum of a substance from the structural formula of its molecules. The prediction is based on an analysis of structure-activity relationships in the training set, which currently contains structures and known biological activities for more than one million molecules. The structure of an organic compound is represented in PASS by Multilevel Neighborhoods of Atoms descriptors; the activity of new compounds is predicted by a naive Bayes classifier using the structure-activity relationships derived from the training set. We have created and improved both local versions of the PASS program and freely available web resources based on PASS (http://www.way2drug.com).
These resources predict several thousand biological activities (pharmacological effects, molecular mechanisms of action, specific toxicity and adverse effects, interaction with unwanted targets, metabolism, and action on molecular transport), cytotoxicity for tumor and non-tumor cell lines, carcinogenicity, induced changes in gene expression profiles, sites of metabolism by the major enzymes of the first and second phases of xenobiotic biotransformation, and whether a compound is a substrate and/or metabolite of metabolic enzymes. The Way2Drug web resource is used by over 18,000 researchers from more than 90 countries, who have obtained over 600,000 predictions and published about 500 papers describing the results. Analysis of the published works shows that in some cases the authors' interpretation of the prediction results needs correction. In this work, we provide the theoretical basis for computer-aided prediction of biological activity spectra and use specific examples to consider its opportunities and limitations.
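The prediction step described above, a naive Bayes classifier applied to substructure descriptors, can be sketched as follows. This is a generic naive Bayes with Laplace smoothing, not the specific estimator used in PASS, and the descriptor strings are hypothetical placeholder tokens rather than real Multilevel Neighborhoods of Atoms descriptors. As a further simplification, only the descriptors present in a molecule contribute to the score.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Model = Tuple[Dict[bool, Counter], Dict[bool, int]]


def train(examples: List[Tuple[Set[str], bool]]) -> Model:
    """Count, per class, how many training molecules contain each descriptor."""
    counts: Dict[bool, Counter] = {True: Counter(), False: Counter()}
    totals: Dict[bool, int] = {True: 0, False: 0}
    for descriptors, active in examples:
        totals[active] += 1
        counts[active].update(set(descriptors))
    return counts, totals


def log_odds(model: Model, descriptors: Iterable[str]) -> float:
    """Log-odds of 'active' vs 'inactive' under the naive independence assumption."""
    counts, totals = model
    # Smoothed class prior.
    score = math.log((totals[True] + 1) / (totals[False] + 1))
    for d in set(descriptors):
        # Laplace-smoothed P(descriptor | class); assumption, not the PASS formula.
        p_active = (counts[True][d] + 1) / (totals[True] + 2)
        p_inactive = (counts[False][d] + 1) / (totals[False] + 2)
        score += math.log(p_active / p_inactive)
    return score


# Toy training set with made-up descriptor tokens.
model = train([
    ({"A", "B"}, True),
    ({"A", "C"}, True),
    ({"D", "C"}, False),
    ({"D", "E"}, False),
])
```

A positive `log_odds` value means the descriptors of the query molecule are seen more often among active training compounds than among inactive ones. In a multi-activity setting such as the one described here, one such model would be trained for each activity, and the resulting scores together would form the predicted activity spectrum.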