2022
DOI: 10.1021/acs.est.2c06155

Machine Learning-Based Models with High Accuracy and Broad Applicability Domains for Screening PMT/vPvM Substances

Abstract: Persistent, mobile, and toxic (PMT) substances and very persistent and very mobile (vPvM) substances can be transported over long distances from various sources, increasing the public health risk. Rapid, high-throughput screening of PMT/vPvM substances is thus warranted to support risk prevention and mitigation measures. Herein, we construct a machine learning-based screening system integrated with five models for high-throughput classification of PMT/vPvM substances. The models are constructed with 44 971 substan…

Cited by 23 publications (20 citation statements)
References 46 publications

“…The broad chemical space of the dataset is critical for the ML modeling and can widen AD of the classifiers, as can also be concluded from a previous study [18]. The number of compounds in most of the communities also increased after the data incorporation. It can be perceived that the more the communities in a dataset, the more the compounds in communities, the more the ML models can learn, and the higher performance the models have.…”
Section: Results and Discussion (mentioning)
confidence: 93%
“…For the 22 classifiers, the differences (δ) between A_ROC on the training set and those on the test set range from 0.027 to 0.076 (Table S3). It can be concluded from previous studies that if δ/A_ROC ≤ 10% (A_ROC on the training set), the corresponding classifiers are free of overfitting. For the 22 classifiers, the δ/A_ROC values are all ≤8%. Therefore, there is scarcely any overfitting in the constructed ML classifiers.…”
Section: Results (mentioning)
confidence: 96%
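
The overfitting rule of thumb quoted in the statement above (the train/test gap in area under the ROC curve, divided by the training-set value, should not exceed 10%) can be sketched in a few lines. This is only an illustration, not code from the citing paper: the function names and the example AUC values are hypothetical, and the 10% threshold is taken from the quoted text.

```python
def relative_auc_gap(train_auc: float, test_auc: float) -> float:
    """Return delta / A_ROC(train), where delta = A_ROC(train) - A_ROC(test)."""
    if train_auc <= 0:
        raise ValueError("train_auc must be positive")
    delta = train_auc - test_auc
    return delta / train_auc


def passes_overfitting_check(train_auc: float, test_auc: float,
                             threshold: float = 0.10) -> bool:
    """Apply the quoted heuristic: relative gap at or below the threshold."""
    return relative_auc_gap(train_auc, test_auc) <= threshold


if __name__ == "__main__":
    # Hypothetical AUC values, chosen only to illustrate the calculation.
    train_auc, test_auc = 0.95, 0.90
    gap = relative_auc_gap(train_auc, test_auc)
    print(f"relative gap: {gap:.1%}")
    print("passes check:", passes_overfitting_check(train_auc, test_auc))
```

The check is a heuristic on a single train/test split; it says nothing about variance across splits, so it is best read as a quick screen rather than proof that a classifier generalizes.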