The International Conference on Harmonization (ICH) M7 guideline allows the use of in silico approaches for predicting Ames mutagenicity for the initial assessment of impurities in pharmaceuticals. This is the first international guideline that addresses the use of quantitative structure–activity relationship (QSAR) models in lieu of actual toxicological studies for human health assessment. Therefore, QSAR models for Ames mutagenicity now require higher predictive power for identifying mutagenic chemicals. To increase the predictive power of QSAR models, larger experimental datasets from reliable sources are required. The Division of Genetics and Mutagenesis, National Institute of Health Sciences (DGM/NIHS) of Japan recently established a unique proprietary Ames mutagenicity database containing 12,140 new chemicals that have not been previously used for developing QSAR models. The DGM/NIHS provided this Ames database to QSAR vendors to validate and improve their QSAR tools. The Ames/QSAR International Challenge Project was initiated in 2014 with 12 QSAR vendors testing 17 QSAR tools against these compounds in three phases. We now present the final results. All tools were considerably improved by participation in this project. Most tools achieved >50% sensitivity (the fraction of all Ames positives that were predicted positive), and predictive power (accuracy) was as high as 80%, almost equivalent to the inter-laboratory reproducibility of Ames tests. To further increase the predictive power of QSAR tools, accumulation of additional Ames test data is required, as well as re-evaluation of some previous Ames test results. Indeed, some Ames-positive or Ames-negative chemicals may have previously been incorrectly classified because of methodological weaknesses, resulting in false-positive or false-negative predictions by QSAR tools. These incorrect data hamper prediction and are a source of noise in the development of QSAR models. It is thus essential to establish a large benchmark database consisting only of well-validated Ames test results to build more accurate QSAR models.
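As a brief illustration of the two performance measures quoted above, the following minimal sketch computes sensitivity and accuracy (plus specificity) from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the project data.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: Ames positives predicted positive/negative.
    tn/fp: Ames negatives predicted negative/positive.
    Assumes at least one positive and one negative chemical.
    """
    sensitivity = tp / (tp + fn)   # positives found among all Ames positives
    specificity = tn / (tn + fp)   # negatives found among all Ames negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not project results).
sens, spec, acc = confusion_metrics(tp=60, fn=40, tn=740, fp=160)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

With a dataset dominated by Ames negatives, as in this made-up example, accuracy can be high even when sensitivity is modest, which is why the two figures are reported separately.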
Purpose: Oral bioavailability (%F) is a key factor that determines the fate of a new drug in clinical trials. Traditionally, %F is measured using costly and time-consuming experimental tests. Developing computational models to evaluate the %F of new drugs before they are synthesized would be beneficial in the drug discovery process.
Methods: We employed a Combinatorial Quantitative Structure-Activity Relationship (QSAR) approach to develop several computational %F models. We compiled a %F dataset of 995 drugs from public sources. After generating chemical descriptors for each compound, we used random forest, support vector machine, k-nearest neighbor, and CASE Ultra to develop the relevant QSAR models. The resulting models were validated using five-fold cross-validation.
Results: The external predictivity of %F values was poor (R² = 0.28, n = 995, MAE = 24), but was improved (R² = 0.40, n = 362, MAE = 21) by filtering out unreliable predictions for compounds that had a high probability of interacting with MDR1 and MRP2 transporters. Furthermore, classifying the compounds according to their %F values (%F < 50% as "low", %F ≥ 50% as "high") and developing category QSAR models resulted in an external accuracy of 76%.
Conclusions: In this study, we developed predictive %F QSAR models that could be used to evaluate new drug compounds, and we found that integrating drug-transporter interaction data greatly benefits the resulting models.
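The evaluation quantities mentioned in this abstract can be sketched roughly as follows. This is only an illustration: the helper names and the four data points are hypothetical, and the sketch covers just the 50% category threshold, the mean absolute error (MAE), and category accuracy, not the authors' modeling workflow.

```python
def categorize(f_percent, threshold=50.0):
    """Label a %F value as 'low' (< threshold) or 'high' (>= threshold)."""
    return "high" if f_percent >= threshold else "low"


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted %F."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def category_accuracy(observed, predicted_labels, threshold=50.0):
    """Fraction of compounds whose predicted label matches the observed category."""
    hits = sum(categorize(o, threshold) == label
               for o, label in zip(observed, predicted_labels))
    return hits / len(observed)


# Hypothetical values for illustration only (not from the 995-drug dataset).
observed = [10.0, 45.0, 50.0, 80.0]
predicted = [20.0, 55.0, 65.0, 70.0]               # continuous model output
predicted_labels = ["low", "high", "high", "high"]  # category model output

print(mean_absolute_error(observed, predicted))
print(category_accuracy(observed, predicted_labels))
```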
Fragment-based expert system models of toxicological end points are primarily comprised of a set of substructures that are statistically related to the toxic property in question. These special substructures are often referred to as toxicity alerts, toxicophores, or biophores. They are the main building blocks/classifying units of the model, and it is important to define the chemical structural space within which the alerts are expected to produce reliable predictions. Furthermore, defining an appropriate applicability domain is required as part of the OECD guidelines for the validation of quantitative structure-activity relationships (QSARs). In this respect, this paper describes a method to construct applicability domains for individual toxicity alerts that are part of the CASE Ultra expert system models. Defining applicability domains for individual alerts was necessary because each CASE Ultra model is comprised of multiple alerts, and different alerts of a model usually represent different toxicity mechanisms and cover different structural space; the use of an applicability domain for the overall model is often not adequate. The domain for each alert was constructed using a set of fragments that were found to be statistically related to the end point in question, as opposed to using overall structural similarity or physicochemical properties. Use of the applicability domains in reducing false positive predictions is demonstrated. It is now possible to obtain ROC (receiver operating characteristic) profiles of CASE Ultra models by applying domain adherence cutoffs on the alerts identified in test chemicals. This helps in optimizing the performance of a model based on its true positive-false positive prediction trade-offs and reduces drastic effects on the predictive performance caused by the active/inactive ratio of the model's training set. None of the major currently available commercial expert systems for toxicity prediction offer the possibility to explore a model's full sensitivity-specificity spectrum, and therefore, the methodology developed in this study can be of benefit in improving the predictive ability of alert-based expert systems.
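The idea of obtaining an ROC profile by sweeping a domain adherence cutoff can be sketched as below. This is a simplified illustration under stated assumptions, not the CASE Ultra procedure: it assumes each test chemical has one identified alert with a hypothetical adherence score between 0 and 1, and that a chemical is predicted active only when that score meets the cutoff.

```python
def roc_points(adherence, is_active, cutoffs):
    """Return (cutoff, true_positive_rate, false_positive_rate) per cutoff.

    adherence: hypothetical domain adherence score of the alert found in
               each test chemical (assumed to lie in [0, 1]).
    is_active: experimental outcome for each chemical (True = active).
    A chemical is called positive when its adherence score >= cutoff.
    """
    n_pos = sum(is_active)
    n_neg = len(is_active) - n_pos
    points = []
    for cutoff in cutoffs:
        tp = sum(1 for a, y in zip(adherence, is_active) if a >= cutoff and y)
        fp = sum(1 for a, y in zip(adherence, is_active) if a >= cutoff and not y)
        points.append((cutoff, tp / n_pos, fp / n_neg))
    return points


# Hypothetical scores and outcomes for illustration only.
adherence = [0.9, 0.8, 0.4, 0.7, 0.3, 0.0]
is_active = [True, True, True, False, False, False]

for cutoff, tpr, fpr in roc_points(adherence, is_active, [0.0, 0.5, 0.85]):
    print(f"cutoff={cutoff:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

Raising the cutoff trades true positives for fewer false positives, which is the trade-off the abstract describes when it speaks of choosing an operating point along the ROC profile.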
The Multiple Computer Automated Structure Evaluation (MCASE) program was used to evaluate the mutagenic potential of organic compounds. The experimental Ames test mutagenic activities for 2513 chemicals were collected from various literature sources. All chemicals have experimental results in one or more Salmonella tester strains. A general mutagenicity data set and fifteen individual Salmonella tester strain data sets were compiled. Analysis of the learning sets by the MCASE program resulted in the derivation of good correlations between chemical structure and mutagenic activity. Compared with the results of our previous reports, significant improvement was obtained as more data were added to the learning databases. Several biophores were identified as being responsible for the mutagenic activity of the majority of active chemicals in each individual mutagenicity module. The multiple-database mutagenicity model showed a clear advantage over the commonly used single-database models. The expertise produced by this analysis can be used to predict the mutagenic potential of new compounds.