Integrating structure annotation and machine learning approaches to develop graphene toxicity models

Wang, Tong; Russo, Daniel P.; Bitounis, Dimitrios; Demokritou, Philip; Jia, Xuelian; Huang, Heng; Zhu, Hao

doi:10.1016/j.carbon.2022.12.065

Cited by 11 publications

(7 citation statements)

References 78 publications

“…Data heterogeneity encompasses both the wide variety of categorical data and the huge range of variation in numerical data, and thus the handling methods are different. Methods of handling categorical data include one-hot coding (which converts categorical values to numeric ones), 51,55 and methods of handling numerical data include normalization, 56 standardization 57 and logarithm-scaling. 58 Commonly used methods for handling missing values, from simple to complex, include:…”

Section: Preparation For Modeling: Data Collection and Descriptor Sel...

mentioning

confidence: 99%
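
The handling methods named in the statement above can be illustrated with a minimal sketch. It is not taken from the cited review or from Wang et al.; the function names and the sample values (`shapes`, `sizes_nm`) are hypothetical, and only the Python standard library is assumed.

```python
import math
from statistics import mean, pstdev


def one_hot(values):
    """One-hot encode categorical values: one 0/1 column per category."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return vectors, categories


def min_max_normalize(xs):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return [0.0 for _ in xs]  # constant column carries no information
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Standardization: subtract the mean, divide by the population std."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0 for _ in xs]
    return [(x - mu) / sigma for x in xs]


def log_scale(xs):
    """Logarithm scaling (base 10) for strictly positive values."""
    return [math.log10(x) for x in xs]


if __name__ == "__main__":
    # Hypothetical descriptor columns, for illustration only.
    shapes = ["sheet", "flake", "sheet"]
    sizes_nm = [10.0, 100.0, 1000.0]
    print(one_hot(shapes))
    print(min_max_normalize(sizes_nm))
    print(standardize(sizes_nm))
    print(log_scale(sizes_nm))
```

Log scaling suits descriptors that span several orders of magnitude, such as particle size, while standardization is the usual choice when an algorithm is sensitive to feature variance.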

“…Additionally, when a new material is added for prediction, researchers need to identify descriptors that can appropriately capture its physicochemical properties. For example, Wang et al 57 developed a set of geometrical descriptors for graphene based on the nanostructure annotation techniques. In another study of Sengottiyan et al, 92 the atomic molecular weight and the number of hybridized carbon atoms were used to describe the structure of the core and coating of organic NMs.…”

Section: The Materials

mentioning

confidence: 99%

Application of Machine Learning in Nanotoxicology: A Critical Review and Perspective

Zhou, Wang, Peijnenburg et al. 2024

Environ. Sci. Technol.

The massive production and application of nanomaterials (NMs) have raised concerns about the potential adverse effects of NMs on human health and the environment. Evaluating the adverse effects of NMs by laboratory methods is expensive, time-consuming, and often fails to keep pace with the invention of new materials. Therefore, in silico methods that utilize machine learning techniques to predict the toxicity potentials of NMs are a promising alternative approach if regulatory confidence in them can be enhanced. Previous reviews and regulatory OECD guidance documents have discussed in detail how to build an in silico predictive model for NMs. Nevertheless, there is still room for improvement in addressing the ways to enhance the model representativeness and performance from different angles, such as data set curation, descriptor selection, task type (classification/regression), algorithm choice, and model evaluation (internal and external validation, applicability domain, and mechanistic interpretation, which is key to ensuring stakeholder confidence). This review explores how to build better predictive models; the current state of the art is analyzed via a statistical evaluation of literature, while the challenges faced and future perspectives are summarized. Moreover, a recommended workflow and best practices are provided to help in developing more predictive, reliable, and interpretable models that can assist risk assessment as well as safe-by-design development of NMs.

“…These projections were then used as inputs to predict the properties and activities of nanoparticles using an image processing convolutional neural network (CNN). Structure annotation techniques, such as Delaunay tessellation, which decomposes the surface of nanostructures into tetrahedra, have been developed to generate nanodescriptors that simulate surface chemistry and properties of complex nanoparticle structures (Figure F). Overall, structure-based modeling, such as QSAR, is reliable in predicting some pharmacokinetic properties and in vitro assay responses with simple mechanisms for new compounds. However, for complex toxicity endpoints (e.g., carcinogenicity and hepatotoxicity), the use of only structural information and chemical properties for modeling (i.e., QSAR) is error-prone, particularly when compounds with similar structures or chemical properties exhibit dissimilar toxicities…”

Section: Feature Data In Computational Toxicology Modeling

mentioning

confidence: 99%

Advancing Computational Toxicology by Interpretable Machine Learning

Jia, Wang, Zhu 2023

Environ. Sci. Technol.

Self Cite

Chemical toxicity evaluations for drugs, consumer products, and environmental chemicals have a critical impact on human health. Traditional animal models to evaluate chemical toxicity are expensive, time-consuming, and often fail to detect toxicants in humans. Computational toxicology is a promising alternative approach that utilizes machine learning (ML) and deep learning (DL) techniques to predict the toxicity potentials of chemicals. Although the applications of ML- and DL-based computational models in chemical toxicity predictions are attractive, many toxicity models are "black boxes" in nature and difficult to interpret by toxicologists, which hampers the chemical risk assessments using these models. The recent progress of interpretable ML (IML) in the computer science field meets this urgent need to unveil the underlying toxicity mechanisms and elucidate the domain knowledge of toxicity models. In this review, we focused on the applications of IML in computational toxicology, including toxicity feature data, model interpretation methods, use of knowledge base frameworks in IML development, and recent applications. The challenges and future directions of IML modeling in toxicology are also discussed. We hope this review can encourage efforts in developing interpretable models with new IML algorithms that can assist new chemical assessments by illustrating toxicity mechanisms in humans.

“…14,15 ML models can utilize physicochemical and structural properties, such as particle size, surface charge, and composition, to estimate the likelihood of adverse effects. 9 Additionally, ML techniques have been exploited for establishing QSAR, shedding light on the molecular mechanisms underlying toxicity and guiding safer-by-design NM. 16 Furthermore, ML can address challenges in data imputation and integration, facilitating comprehensive and reliable toxicological analyses.…”

mentioning

confidence: 99%

“…Hence, it is important to progress toward a modeling-based approach and develop good predictor-based case studies to support the transition. Data modeling can be done via different methods, e.g., quantitative structure–activity relationship (QSAR) analysis and machine learning (ML), among others, where many material features can be analyzed at the same time and used to predict toxicity.…”

mentioning

confidence: 99%

Machine Learning Allowed Interpreting Toxicity of a Fe-Doped CuO NM Library Large Data Set─An Environmental In Vivo Case Study

Scott-Fordsmand, Gomes, Pokhrel et al. 2024

ACS Appl. Mater. Interfaces

The wide variation of nanomaterial (NM) characters (size, shape, and properties) and the related impacts on living organisms make it virtually impossible to assess their safety; the need for modeling has been urged for long. We here investigate the custom-designed 1–10% Fe-doped CuO NM library. Effects were assessed using the soil ecotoxicology model Enchytraeus crypticus (Oligochaeta) in the standard 21 days plus its extension (49 days). Results showed that 10%Fe-CuO was the most toxic (21 days reproduction EC50 = 650 mg NM/kg soil) and Fe3O4 NM was the least toxic (no effects up to 3200 mg NM/kg soil). All other NMs caused similar effects to E. crypticus (21 days reproduction EC50 ranging from 875 to 1923 mg NM/kg soil, with overlapping confidence intervals). Aiming to identify the key NM characteristics responsible for the toxicity, machine learning (ML) modeling was used to analyze the large data set [9 NMs, 68 descriptors, 6 concentrations, 2 exposure times (21 and 49 days), 2 endpoints (survival and reproduction)]. ML allowed us to separate experimental related parameters (e.g., zeta potential) from particle-specific descriptors (e.g., force vectors) for the best identification of important descriptors. We observed that concentration-dependent descriptors (environmental parameters, e.g., zeta potential) were the most important under standard test duration (21 day) but not for longer exposure (closer representation of real-world conditions). In the longer exposure (49 days), the particle-specific descriptors were more important than the concentration-dependent parameters. The longer-term exposure showed that the steepness of the concentration–response decreased with an increased Fe content in the NMs. Longer-term exposure should be a requirement in the hazard assessment of NMs in addition to the standard in OECD guidelines for chemicals. The progress toward ML analysis is desirable given its need for such large data sets and significant power to link NM descriptors to effects in animals. This is beyond the current univariate and concentration–response modeling analysis.
