2022
DOI: 10.1021/acs.est.2c00765

Graph Attention Network Model with Defined Applicability Domains for Screening PBT Chemicals

Abstract: In silico models for screening environmentally persistent, bio-accumulative, and toxic (PBT) substances are necessary for sound management of chemicals. Due to the complex structure–activity landscapes (SALs) on the PBT attributes, previous models for screening PBT chemicals lack either applicability domain (AD) characterizations or interpretability, restricting their applications. Herein, graph attention networks (GATs), a novel neural network architecture, were introduced to construct models for screening PB…

Cited by 33 publications (50 citation statements)
References 45 publications
“…AD Characterization. Based on previous research gains [19,20] and SAL analysis, an AD characterization method abbreviated as AD_SAL{ρ_s, I_A} was raised in the current study, where ρ_s stands for molecular similarity density and I_A describes inconsistency in molecular activities or molecular local discontinuity. ρ_s defines weighted similarity density between a query compound and training compounds:…”
Section: Materials and Methods (mentioning)
confidence: 99%
“…Recently, an AD characterization (AD_FP-AC), where the subscript "FP" represents molecular fingerprints and "AC" represents ACs in SALs, was proposed. [19,20] The AD_FP-AC employs local discontinuity scores (S_D) to identify compounds on the ACs and was proven to outperform a previous AD characterization method for ML-based QSARs. [20] The AD_FP-AC calculates molecular similarity (S_M), employs a cutoff value of S_M (S_cutoff) to determine neighboring training compounds of a query compound, and employs the neighboring training compounds to calculate arithmetically the average S_M and S_D.…”
Section: Introduction (mentioning)
confidence: 99%
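
The neighbor-based averaging described in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation. It assumes fingerprints are given as sets of on-bit indices, that molecular similarity is the Tanimoto coefficient, and that each training compound already carries a precomputed local discontinuity score; the statement does not define how that score is obtained. The names `tanimoto` and `neighbor_averages` are hypothetical.

```python
from statistics import fmean


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Assumption: Tanimoto stands in for the similarity measure S_M.
    """
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def neighbor_averages(query_fp, training, s_cutoff):
    """Return (n_neighbors, mean S_M, mean S_D) for a query compound.

    `training` is a list of (fingerprint, s_d) pairs, where s_d is a
    precomputed local discontinuity score (hypothetical input; its
    definition is not given in the quoted text).
    """
    # Similarity of the query to every training compound.
    scored = [(tanimoto(query_fp, fp), s_d) for fp, s_d in training]
    # Neighbors are training compounds whose similarity reaches the cutoff.
    neighbors = [(s_m, s_d) for s_m, s_d in scored if s_m >= s_cutoff]
    if not neighbors:
        # No neighbors: the averages are undefined for this query.
        return 0, None, None
    # Arithmetic averages over the neighbors only.
    mean_s_m = fmean([s_m for s_m, _ in neighbors])
    mean_s_d = fmean([s_d for _, s_d in neighbors])
    return len(neighbors), mean_s_m, mean_s_d


if __name__ == "__main__":
    toy_training = [({1, 2, 3, 4}, 0.2), ({1, 2, 3, 5}, 0.6), ({7, 8}, 0.9)]
    print(neighbor_averages({1, 2, 3, 4}, toy_training, s_cutoff=0.5))
```

In the toy call, only the first two training compounds reach the cutoff, so the third has no influence on either average. A query with no neighbors returns `None` for both averages rather than a numeric value, which leaves the out-of-domain decision to the caller.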
“…[13] Thus, machine learning (ML) has attracted attention for the rapid and accurate prediction of substance properties crucial to the identification of PMT/vPvM substances because of its versatility. [14,15] A key to the success of ML modeling is the need for diverse datasets and appropriate algorithms, which directly influence applicability domains (ADs) and generalization ability. [16,17] Limited data for ML model construction may lead to overfitting, narrow ADs, and poor generalization, whereas inadequate ML algorithms only serve as a black box without any interpretation value to reveal the mechanism that relates imported information to the modeling results.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, there is a pressing demand for low-cost and high-throughput protocols for screening PMT/vPvM substances, as the testing process for acquiring these parameters requires numerous expensive and time-consuming experimental protocols. Thus, machine learning (ML) has attracted attention for the rapid and accurate prediction of substance properties crucial to the identification of PMT/vPvM substances because of its versatility. A key to the success of ML modeling is the need for diverse datasets and appropriate algorithms, which directly influence applicability domains (ADs) and generalization ability. Limited data for ML model construction may lead to overfitting, narrow ADs, and poor generalization, whereas inadequate ML algorithms only serve as a black box without any interpretation value to reveal the mechanism that relates imported information to the modeling results. ML models built upon big data and multi-algorithms are thus urgently needed as a tool for the rapid and accurate identification of PMT/vPvM substances.…”
Section: Introduction (mentioning)
confidence: 99%