Drug-induced liver injury (DILI) is a major issue for both patients and the pharmaceutical industry due to insufficient means of prevention and prediction. In the current work we present a 2-class classification model for DILI, generated with Random Forest and 2D molecular descriptors on a dataset of 966 compounds. In addition, predicted transporter inhibition profiles were included in the models. The initially compiled dataset of 1773 compounds was reduced via a 2-step approach to 966 compounds, resulting in a significant increase (p-value < 0.05) in model performance. The models were validated via 10-fold cross-validation and against three external test sets of 921, 341 and 96 compounds, respectively. The final model showed an accuracy of 64% (AUC 68%) for 10-fold cross-validation (average of 50 iterations) and comparable values for two test sets (AUC 59%, 71% and 66%, respectively). In the study we also examined whether the predictions of our in-house transporter inhibition models for BSEP, BCRP, P-glycoprotein, and OATP1B1 and 1B3 contributed to improving the DILI model. Finally, the model was implemented with open-source 2D RDKit descriptors in order to be provided to the community as a Python script.
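The two metrics reported above, accuracy and AUC, can be illustrated with a minimal sketch. This is not the authors' script: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AUC is computed here by its pairwise-ranking definition rather than by any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = DILI positive, 0 = DILI negative.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.8, 0.5]
# Assumed decision threshold of 0.5 on the predicted score.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(labels, predicted):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC only depends on how the scores rank positives against negatives, which is one reason the two figures in an abstract like this can differ by several points.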
Cholestasis represents one of three types of drug-induced liver injury (DILI), which poses a major challenge in drug development. In this study we applied a two-class classification scheme based on k-nearest neighbors in order to predict cholestasis, using a set of 93 two-dimensional (2D) physicochemical descriptors and predictions of selected hepatic transporters’ inhibition (BSEP, BCRP, P-gp, OATP1B1, and OATP1B3). In order to assess the potential contribution of transporter inhibition, we tested whether including the transporters’ inhibition predictions yields a significant increase in model performance compared with using the 93 2D physicochemical descriptors alone. Our findings were in agreement with the literature, indicating a contribution not only from BSEP inhibition but rather a synergistic effect deriving from the whole set of transporters. The final optimal model was validated via both 10-fold cross-validation and external validation. It performs satisfactorily, reaching 0.686 ± 0.013 for accuracy and 0.722 ± 0.014 for area under the receiver operating characteristic curve (AUC) for 10-fold cross-validation (mean ± standard deviation from 50 iterations).
Cheminformatics datasets used in classification problems, especially those related to biological or physicochemical properties, are often imbalanced. This presents a major challenge in the development of in silico prediction models, as traditional machine learning algorithms are known to work best on balanced datasets. Class imbalance biases the performance of these algorithms because they favor the majority class. Here, we present a comparison of the performance of seven different meta-classifiers for their ability to handle imbalanced datasets, whereby Random Forest is used as base-classifier. Four different datasets that are directly (cholestasis) or indirectly (via inhibition of organic anion transporting polypeptide 1B1 and 1B3) related to liver toxicity were chosen for this purpose. The imbalance ratio in these datasets ranges between 4:1 and 20:1 for negative and positive classes, respectively. Three different sets of molecular descriptors for model development were used, and their performance was assessed in 10-fold cross-validation and on an independent validation set. Stratified Bagging, MetaCost and CostSensitiveClassifier were found to be the best performing among all the methods. While MetaCost and CostSensitiveClassifier provided better sensitivity values, Stratified Bagging resulted in high balanced accuracies. Electronic supplementary material: The online version of this article (10.1007/s10822-018-0116-z) contains supplementary material, which is available to authorized users.
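As a rough illustration of how a bagging scheme can counter class imbalance, the sketch below draws bags that contain equally many positives and negatives. It reflects one common reading of stratified bagging and is not taken from the paper: the function names, the choice of the minority-class size as the per-class bag size, and the toy labels are all assumptions. The base classifier (Random Forest in the abstract) is left out; only the sampling and the vote averaging are shown.

```python
import random


def stratified_bags(labels, n_bags, seed=0):
    """Return index lists; each bag holds equally many positives and negatives.

    Sampling is with replacement, and the per-class bag size equals the
    size of the minority class (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    k = min(len(pos), len(neg))
    bags = []
    for _ in range(n_bags):
        bag = rng.choices(pos, k=k) + rng.choices(neg, k=k)
        rng.shuffle(bag)
        bags.append(bag)
    return bags


def average_votes(per_bag_predictions):
    """Average 0/1 predictions of the per-bag models into one score per compound."""
    n_models = len(per_bag_predictions)
    return [sum(col) / n_models for col in zip(*per_bag_predictions)]


# Hypothetical toy labels with a 4:1 negative-to-positive ratio.
labels = [0] * 16 + [1] * 4
bags = stratified_bags(labels, n_bags=5, seed=42)
for bag in bags:
    n_pos = sum(labels[i] for i in bag)
    print(f"bag size = {len(bag)}, positives = {n_pos}")
```

Each model trained on such a bag sees a balanced sample, so the majority class no longer dominates the training signal, while averaging over many bags lets most of the majority-class compounds still contribute.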
Organic anion transporting polypeptides 1B1 and 1B3 are transporters selectively expressed on the basolateral membrane of the hepatocyte. Several studies reveal that they are involved in drug–drug interactions, cancer, and hyperbilirubinemia. In this study, we developed a set of classification models for OATP1B1 and 1B3 inhibition based on more than 1700 carefully curated compounds from the literature, which were validated via cross-validation and by use of an external test set. After combining several sets of descriptors and classifiers, the 6 best models were selected according to their statistical performance and were used for virtual screening of DrugBank. Consensus scoring of the screened compounds resulted in the selection and purchase of nine compounds as potential dual inhibitors and of one compound as potential selective OATP1B3 inhibitor. Biological testing of the compounds confirmed the validity of the models, yielding an accuracy of 90% for OATP1B1 and 80% for OATP1B3. Moreover, at least half of the newly identified inhibitors are associated with hyperbilirubinemia or hepatotoxicity, implying a relationship between OATP inhibition and these severe side effects.
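The abstract does not say how the consensus score was computed. One simple possibility, shown here purely as an assumption, is to count how many models flag each compound as an inhibitor and keep the compounds that reach a vote threshold. The model names, compound identifiers, and threshold below are hypothetical.

```python
from collections import Counter


def consensus_votes(model_predictions):
    """Count, per compound, how many models predict it as an inhibitor (label 1)."""
    votes = Counter()
    for predictions in model_predictions.values():
        for compound, label in predictions.items():
            votes[compound] += label
    return votes


def select_hits(votes, min_votes):
    """Return the compounds flagged by at least min_votes models, sorted by name."""
    return sorted(c for c, v in votes.items() if v >= min_votes)


# Hypothetical predictions from three models on four screened compounds.
model_predictions = {
    "model_1": {"cpd_A": 1, "cpd_B": 0, "cpd_C": 1, "cpd_D": 0},
    "model_2": {"cpd_A": 1, "cpd_B": 1, "cpd_C": 1, "cpd_D": 0},
    "model_3": {"cpd_A": 1, "cpd_B": 0, "cpd_C": 0, "cpd_D": 0},
}

votes = consensus_votes(model_predictions)
hits = select_hits(votes, min_votes=2)
print(hits)
```

Running the same vote count separately for OATP1B1 and OATP1B3 models would let one distinguish candidate dual inhibitors (hits for both) from candidate selective ones (hits for only one transporter).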
Although new chemical entities are rarely registered and patented after many years of inconclusive clinical trials, the involvement of ALR2 in inflammatory pathologies might renew interest in the field of ARIs.