Earlier we created a chemical hazard database of approximately 10 000 chemicals via natural language processing of dossiers submitted to the European Chemicals Agency. We identified repeat OECD guideline tests to establish reproducibility of acute oral and dermal toxicity, eye and skin irritation, mutagenicity and skin sensitization. Based on 350–700+ chemicals each, the probability that an OECD guideline animal test would yield the same result in a repeat test was 78%–96% (sensitivity 50%–87%). An expanded database with more than 866 000 chemical properties/hazards was used as training data to model health hazards and chemical properties. The constructed models automate and extend the read-across method of chemical classification. The novel models, called RASARs (read-across structure activity relationships), use binary fingerprints and Jaccard distance to define chemical similarity. A large chemical similarity adjacency matrix is constructed from this similarity metric and is used to derive feature vectors for supervised learning. We show results on 9 health hazards from 2 kinds of RASARs: “Simple” and “Data Fusion”. The “Simple” RASAR seeks to duplicate the traditional read-across method, predicting hazard from chemical analogs with known hazard data. The “Data Fusion” RASAR extends this concept by creating large feature vectors from all available property data rather than only the modeled hazard. Simple RASAR models tested in cross-validation achieve 70%–80% balanced accuracies with constraints on tested compounds. Cross-validation of Data Fusion RASARs shows balanced accuracies in the 80%–95% range across 9 health hazards with no constraints on tested compounds.
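As an illustration of how Jaccard distance on binary fingerprints might be turned into a feature vector for supervised learning, here is a minimal sketch. It assumes fingerprints are stored as sets of "on" bit indices and that a Simple-style feature vector consists of the distance to the closest known positive and the closest known negative analog. The function names, the two-feature layout, and the toy data are hypothetical and are not taken from the abstract.

```python
from typing import Iterable, Set, Tuple

Fingerprint = Set[int]  # assumption: a binary fingerprint as the set of its "on" bit indices


def jaccard_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Jaccard distance: 1 - |a AND b| / |a OR b|."""
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / len(union)


def simple_features(
    query: Fingerprint,
    training: Iterable[Tuple[Fingerprint, bool]],
) -> Tuple[float, float]:
    """Hypothetical feature vector: (distance to nearest positive, distance to nearest negative)."""
    training = list(training)
    nearest_pos = min(
        (jaccard_distance(query, fp) for fp, hazardous in training if hazardous),
        default=1.0,  # no positive analog known: use the maximum distance
    )
    nearest_neg = min(
        (jaccard_distance(query, fp) for fp, hazardous in training if not hazardous),
        default=1.0,  # no negative analog known: use the maximum distance
    )
    return nearest_pos, nearest_neg


# Toy data, invented for illustration only.
training = [
    ({1, 2, 3, 4}, True),
    ({1, 2, 3, 9}, True),
    ({5, 6, 7, 8}, False),
    ({4, 5, 6, 7}, False),
]
query = {1, 2, 3, 5}
features = simple_features(query, training)
print(features)
```

A vector like this could then be passed to any standard classifier. A Data Fusion-style variant would, under the same assumptions, concatenate such distance pairs for every available property rather than only the modeled hazard.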
Summary: Grouping of substances and utilizing read-across of data within those groups represents an important data gap filling technique for chemical safety assessments. Categories/analogue groups are typically developed based on structural similarity and, increasingly often, also on mechanistic (biological) similarity. While read-across can play a key role in complying with legislation such as the European REACH regulation, the lack of consensus regarding the extent and type of evidence necessary to support it often hampers its successful application and acceptance by regulatory authorities. Despite a potentially broad user community, expertise is still concentrated in a handful of organizations and individuals. In order to facilitate the effective use of read-across, this document presents the state of the art, summarizes insights gained from reviewing ECHA published decisions regarding the relative successes/pitfalls surrounding read-across under REACH, and compiles the relevant activities and guidance documents. Special emphasis is given to the available existing tools and approaches, an analysis of ECHA's published final decisions associated with all levels of compliance checks and testing proposals, the consideration and expression of uncertainty, the use of biological support data, and the impact of the ECHA Read-Across Assessment Framework (RAAF) published in 2015.
Summary: Public data from ECHA online dossiers on 9,801 substances encompassing 326,749 experimental key studies and additional information on classification and labeling were made computable. Eye irritation hazard, for which the rabbit Draize eye test still represents the reference method, was analyzed. Dossiers contained 9,782 Draize eye studies on 3,420 unique substances, indicating frequent retesting of substances. This allowed assessment of the test’s reproducibility based on all substances tested more than once. There was a 10% chance of a non-irritant evaluation after a prior severe-irritant result according to UN GHS classification criteria. The most reproducible outcomes were negative (94% reproducible) and severe eye irritant (73% reproducible). To evaluate whether other GHS categorizations predict eye irritation, we built a dataset of 5,629 substances (1,931 “irritant” and 3,698 “non-irritant”). The two best decision trees with up to three other GHS classifications resulted in balanced accuracies of 68% and 73%, i.e., in the rank order of the Draize rabbit eye test itself, but both use inhalation toxicity data (“May cause respiratory irritation”), which is not typically available. Next, a dataset of 929 substances with at least one Draize study was mapped to PubChem to compute chemical similarity using 2D conformational fingerprints and Tanimoto similarity. Using a minimum similarity of 0.7 and simple classification by the closest chemical neighbor resulted in balanced accuracy from 73% over 737 substances to 100% at a threshold of 0.975 over 41 substances. This represents strong support for read-across and (Q)SAR approaches in this area.
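The closest-neighbor classification with a minimum similarity threshold described above can be sketched as follows. This is an illustrative leave-one-out evaluation on invented toy fingerprints, not the procedure or data from the study. Fingerprints are assumed to be sets of "on" bit indices, and all function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Fingerprint = Set[int]  # assumption: "on" bit indices of a binary fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity of two binary fingerprints: |a AND b| / |a OR b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def nearest_neighbor_label(
    query: Fingerprint,
    reference: List[Tuple[Fingerprint, bool]],
    min_similarity: float = 0.7,
) -> Optional[bool]:
    """Label of the most similar reference substance, or None if it is below the threshold."""
    best_sim, best_label = -1.0, None
    for fp, label in reference:
        sim = tanimoto(query, fp)
        if sim > best_sim:
            best_sim, best_label = sim, label
    return best_label if best_sim >= min_similarity else None


def balanced_accuracy(pairs: List[Tuple[bool, bool]]) -> float:
    """Mean of sensitivity and specificity over (true label, predicted label) pairs."""
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy substances (fingerprint, is_irritant), invented for illustration only.
data = [
    ({1, 2, 3, 4, 5, 6}, True),
    ({1, 2, 3, 4, 5, 7}, True),
    ({10, 11, 12, 13, 14, 15}, False),
    ({10, 11, 12, 13, 14, 16}, False),
    ({1, 2, 3, 4, 5, 6, 8}, False),
    ({20, 21, 22}, True),  # has no neighbor above the threshold
]

# Leave-one-out: predict each substance from all the others.
pairs = []
for i, (fp, truth) in enumerate(data):
    predicted = nearest_neighbor_label(fp, data[:i] + data[i + 1:])
    if predicted is not None:  # substances without a close enough neighbor are not covered
        pairs.append((truth, predicted))

covered = len(pairs)
bal_acc = balanced_accuracy(pairs)
print(covered, round(bal_acc, 3))
```

Raising `min_similarity` in such a setup shrinks the set of covered substances, which is consistent with the trade-off reported above between coverage (737 versus 41 substances) and balanced accuracy.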