Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models (http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/).
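As a rough illustration of the evaluation reported in the carcinogenicity abstract above, the sketch below combines the outputs of several base models and computes accuracy, sensitivity, and specificity. The abstract does not say how the base models are combined, so averaging predicted probabilities is an assumption here. The function names (`ensemble_average`, `confusion_metrics`) and all numbers are hypothetical and not taken from the study.

```python
from statistics import mean


def ensemble_average(prob_lists, threshold=0.5):
    """Average per-compound probabilities from several base models,
    then label a compound 1 (carcinogen) if the average reaches the threshold.
    Averaging is an assumed combination rule, not the paper's stated method."""
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all base models must score the same compounds")
    averaged = [mean(p[i] for p in prob_lists) for i in range(n)]
    return [1 if a >= threshold else 0 for a in averaged]


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate), specificity (true-negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: six compounds, three base models (e.g. one per fingerprint).
y_true = [1, 1, 1, 0, 0, 0]
model_probs = [
    [0.9, 0.6, 0.3, 0.2, 0.7, 0.1],
    [0.8, 0.4, 0.4, 0.3, 0.6, 0.2],
    [0.7, 0.8, 0.2, 0.1, 0.5, 0.3],
]
y_pred = ensemble_average(model_probs)
metrics = confusion_metrics(y_true, y_pred)
print(y_pred, metrics)
```

In a five-fold cross-validation setting, the same metrics would be computed once per fold and then summarized as a mean and standard deviation.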
Drug-induced liver injury (DILI) is a major safety concern in the drug-development process, and various methods have been proposed to predict the hepatotoxicity of compounds during the early stages of drug trials. In this study, we developed an ensemble model using 3 machine learning algorithms and 12 molecular fingerprints from a dataset containing 1241 diverse compounds. The ensemble model achieved an average accuracy of 71.1 ± 2.6%, sensitivity (SE) of 79.9 ± 3.6%, specificity (SP) of 60.3 ± 4.8%, and area under the receiver-operating characteristic curve (AUC) of 0.764 ± 0.026 in 5-fold cross-validation and an accuracy of 84.3%, SE of 86.9%, SP of 75.4%, and AUC of 0.904 in an external validation dataset of 286 compounds collected from the Liver Toxicity Knowledge Base. Compared with previous methods, the ensemble model achieved relatively high accuracy and SE. We also identified several substructures related to DILI. In addition, we provide a web server offering access to our models (http://ccsipb.lnu.edu.cn/toxicity/HepatoPred-EL/).
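The hepatotoxicity abstract above reports an area under the receiver-operating characteristic curve (AUC) alongside accuracy, SE, and SP. A minimal sketch of how an AUC can be computed from model scores follows, using the pairwise-ranking definition: the fraction of (toxic, non-toxic) pairs in which the toxic compound receives the higher score. The function name `roc_auc` and the toy scores are hypothetical; the study's own tooling is not described in the abstract.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive outscores
    a randomly chosen negative; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four compounds (1 = hepatotoxic, 0 = not).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(auc)
```

This quadratic-time form is meant for clarity; rank-based implementations give the same value more efficiently on large datasets.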
Background: Alzheimer's disease (AD) is the most common cause of dementia and is characterized by progressive cognitive impairment in the elderly. The most dramatic abnormalities are those of the cholinergic system. Acetylcholinesterase (AChE) plays a key role in the regulation of the cholinergic system, and hence inhibition of AChE has emerged as one of the most promising strategies for the treatment of AD. Methods: In this study, we suggest a workflow for the identification and prioritization of potential compounds targeted against AChE. To elucidate the essential structural features for AChE inhibition, three-dimensional pharmacophore models were constructed using the Discovery Studio 2.5.5 (DS 2.5.5) program based on a set of known AChE inhibitors. Results: The best five-feature pharmacophore model, which includes one hydrogen bond donor and four hydrophobic features, was generated from a training set of 62 compounds and yielded a correlation coefficient of R = 0.851; it also predicted fit values well for a set of 26 test molecules, with a correlation of R² = 0.830. Our pharmacophore model also has a high Güner-Henry score and enrichment factor. Virtual screening performed on the NCI database identified new compounds with the potential to inhibit AChE and to protect neurons from Aβ toxicity. The hit compounds were subsequently subjected to molecular docking and evaluated by a consensus scoring function, which resulted in 9 compounds with high pharmacophore fit values and predicted biological activity scores. These compounds showed interactions with important residues at the active site. Conclusions: The information gained from this study may assist in the discovery of potential AChE inhibitors that are highly selective for its dual binding sites.
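The AChE abstract above cites the Güner-Henry (GH) score and the enrichment factor (EF) as validation measures for the pharmacophore model without giving their definitions. The sketch below uses the commonly cited forms, EF = (Ha/Ht)/(A/D) and GH = [Ha(3A + Ht)/(4·Ht·A)]·[1 − (Ht − Ha)/(D − A)], where D is the database size, A the number of known actives, Ht the number of hits retrieved, and Ha the number of actives among those hits. The function names and the example counts are hypothetical and are not values from the study.

```python
def enrichment_factor(D, A, Ht, Ha):
    """Ratio of the active fraction in the hit list to the active fraction
    in the whole database."""
    return (Ha / Ht) / (A / D)


def guner_henry_score(D, A, Ht, Ha):
    """GH score: a weighted mix of yield (Ha/Ht) and recall (Ha/A),
    penalized by the fraction of inactives that were retrieved."""
    yield_recall = Ha * (3 * A + Ht) / (4 * Ht * A)
    inactive_penalty = 1 - (Ht - Ha) / (D - A)
    return yield_recall * inactive_penalty


# Hypothetical screening result: 1000 compounds, 50 actives,
# 60 hits retrieved, 40 of which are true actives.
ef = enrichment_factor(D=1000, A=50, Ht=60, Ha=40)
gh = guner_henry_score(D=1000, A=50, Ht=60, Ha=40)
print(round(ef, 3), round(gh, 3))
```

A GH score close to 1 indicates that the hit list is both rich in actives and recovers most of them; a value near 0 indicates a model that barely separates actives from decoys.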
Toxicity evaluation is an important part of the preclinical safety assessment of new drugs and is directly related to human health and the fate of drug candidates. It is therefore important to study how to evaluate drug toxicity accurately and economically. Traditional in vitro and in vivo toxicity tests are laborious, time-consuming, and highly expensive, and they also raise animal welfare issues. Computational methods developed for drug toxicity prediction can compensate for the shortcomings of traditional methods and have been considered useful in the early stages of drug development. Numerous drug toxicity prediction models have been developed using a variety of computational methods. With advances in machine learning theory and molecular representation, more and more drug toxicity prediction models are being developed using machine learning methods such as support vector machines, random forests, naive Bayes classifiers, and back-propagation neural networks. Significant advances have been made for many toxicity endpoints, such as carcinogenicity, mutagenicity, and hepatotoxicity. In this review, we aim to provide a comprehensive overview of the machine learning-based drug toxicity prediction studies conducted in recent years. In addition, we compare the performance of the models proposed in these studies in terms of accuracy, sensitivity, and specificity, providing a view of the current state of the art in this field and highlighting the issues in current studies.