Preliminary knowledge about the toxicity of new substances intended for food use can speed the selection of useful and safer substances. For this purpose, a Quantitative Structure-Toxicity Relationship (QSTR) model was developed with 139,395 structures obtained from three lists of toxic (US EPA DSSTox) and non-toxic (FEMA GRAS™ and FDA GRAS) substances. The 2D coordinates were obtained, standardized and checked, and a total of 4,860 fingerprint fragments defined by Klekota and Roth were calculated for each substance and used as independent variables. The data were processed to remove highly correlated variables and fragments with near-zero variance, reducing the number of fragments to 166. The dependent variable was a binary classification, where 0 corresponds to non-toxic and 1 corresponds to toxic. The classification models were built as decision trees using the J48 and random tree algorithms. The models were evaluated on their predictive performance in training, cross-validation and external validation. The random tree model was selected as the best because it gave the best external validation values (accuracy = 0.9658, sensitivity = 0.9798, specificity = 0.5495, efficiency = 0.7640 and phi coefficient = 0.4941). The developed QSTR model can be used to predict the toxicity of novel food additives, manufacturing technology adjuvants and nutraceuticals.
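The validation statistics quoted above can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code. It assumes that "toxic" (class 1) is the positive class, and that "efficiency" means the mean of sensitivity and specificity. That second assumption is only approximately consistent with the reported figures and is not confirmed by the source. The function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def _safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(tp, tn, fp, fn):
    """Compute classification metrics from confusion-matrix counts.

    tp: toxic substances predicted toxic (positive class assumed = toxic, 1)
    tn: non-toxic substances predicted non-toxic
    fp: non-toxic substances predicted toxic
    fn: toxic substances predicted non-toxic
    """
    total = tp + tn + fp + fn
    sensitivity = _safe_div(tp, tp + fn)  # true positive rate
    specificity = _safe_div(tn, tn + fp)  # true negative rate
    # Phi coefficient (also known as the Matthews correlation coefficient).
    phi_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumption: efficiency taken as the mean of the two rates.
        "efficiency": (sensitivity + specificity) / 2,
        "phi": _safe_div(tp * tn - fp * fn, phi_den),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the study.
    for name, value in binary_metrics(tp=95, tn=6, fp=4, fn=5).items():
        print(f"{name}: {value:.4f}")
```

With a data set dominated by the toxic class, accuracy and sensitivity can be high while specificity stays modest, which is why the phi coefficient and efficiency are reported alongside them.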