2014
DOI: 10.1080/1062936x.2014.942357

A novel approach to generate robust classification models to predict developmental toxicity from imbalanced datasets

Abstract: Computational models to predict the developmental toxicity of compounds are built on imbalanced datasets wherein the toxicants outnumber the non-toxicants. Consequently, the results are biased towards the majority class (toxicants). To overcome this problem and to obtain sensitive but also accurate classifiers, we followed an integrated approach wherein (i) the Synthetic Minority Over-sampling Technique (SMOTE) is used for re-sampling, (ii) a genetic algorithm (GA) is used for variable selection and (iii) support vector machi…
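
The re-sampling step (i) in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between an existing minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy `non_toxicants` data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Hypothetical descriptor vectors for the minority class (non-toxicants here).
non_toxicants = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(non_toxicants, n_synthetic=6, k=2)
print(len(new_points), "synthetic minority samples generated")
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region already covered by the minority class; re-sampling of this kind is normally applied to the training data only.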

Citations: cited by 10 publications (8 citation statements)
References: 35 publications
“…For the feature representation of chemicals, a pool of 818 bidimensional molecular descriptors was calculated. In this respect, 212 were computed by using RDKit (2021.09.4 release), which include mostly constitutional, topological, and electro-topological indexes, and 606 from Mordred (1.2.0 version), which are autocorrelators widely employed in previous studies. The entire list of descriptors is available as Supporting Information (File S3).…”
Section: Methods (mentioning)
confidence: 99%
“…Being considered as a default method for feature selection, the SVM (Support Vector Machine) classifier was employed to weigh feature importance within each round of a 10-fold cross-validation run. All the available 818 features were thus ranked according to their importance by SVM.…”
Section: Methods (mentioning)
confidence: 99%
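
The quoted statement does not say how the SVM "weighs" features. One common reading, assumed here, is to train a linear SVM on the training part of each fold and use the absolute value of each learned weight as that feature's importance, averaged over the 10 folds. The sketch below follows that assumption on toy data with features on comparable scales. `train_linear_svm`, `rank_features`, and all hyperparameters are hypothetical names and values, not the citing authors' code.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w


def rank_features(X, y, n_folds=10):
    """Average |weight| per feature over the training parts of each fold."""
    d = len(X[0])
    importance = [0.0] * d
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        for j in range(d):
            importance[j] += abs(w[j]) / n_folds
    # Feature indices, most important first
    return sorted(range(d), key=lambda j: importance[j], reverse=True)


# Toy data: only the feature at index 1 carries the class signal.
rng = random.Random(0)
y = [1 if (i // 10) % 2 == 0 else -1 for i in range(100)]
X = [[rng.gauss(0, 1), 2.0 * yi + rng.gauss(0, 0.5), rng.gauss(0, 1)] for yi in y]
ranking = rank_features(X, y)
print("features ranked by mean |weight|:", ranking)
```

Weight magnitudes are only comparable when the features share a scale, so real descriptors would normally be standardised before ranking.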
“…The in vitro testing of pregnant animals, preferably rats and rabbits, allows for the prediction of toxic effects in both the dams and their fetuses. 86,87 In addition to traditional in vivo methods, computational approaches, including ML models 2,9,17,18,88–91 and DL models, 18 have been used as alternative methods to assess several endpoints of reproductive toxicity such as sperm reduction, gonadal dysgenesis, abnormal ovulation, teratogenicity, and infertility growth retardation.…”
Section: Toxicity Types (mentioning)
confidence: 99%
“…Since SVM can handle correlated descriptors and has good generalization performance, it has been widely used in the development of models for predicting toxicity of chemicals, with 304 models reported. 18,25–27,29,32,34,35,64–70,82,89,93,101,116,117…”
Section: And DL Models (mentioning)
confidence: 99%