2020
DOI: 10.1016/j.procs.2020.08.043
A Resampling Method for Imbalanced Datasets Considering Noise and Overlap

Cited by 43 publications (24 citation statements)
References 16 publications
“…For finding the hyperparameters described in Table 2, we followed this process: (1) the ML algorithm was selected: Random Forest, Decision Tree, MLP, CNN, Logistic Regression, or SVM; (2) the hyperparameter range for the selected ML algorithm was established, with different values for every hyperparameter so that several combinations could be explored; (3) techniques for imbalanced datasets were established, of two types: data-level (random undersampling, random oversampling, and SMOTE-Tomek [50]) and algorithm-level (balanced weights and cost-sensitive); and (4) for each imbalanced-data technique, a hyperparameter search was implemented using the selected ML algorithm, its hyperparameter range, stratified k-fold cross-validation, and balanced accuracy for evaluation. The ML algorithm was trained with the first hyperparameter combination using stratified k-fold cross-validation over the training set.…”
Section: Methods
confidence: 99%
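The search procedure quoted above can be sketched with scikit-learn; the parameter ranges below are assumptions for illustration, not those of the cited paper, and the algorithm-level technique is represented by `class_weight="balanced"`:

```python
# Minimal sketch of the described search: stratified k-fold CV scored by
# balanced accuracy, with an algorithm-level imbalance technique
# (balanced class weights). Parameter ranges here are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic imbalanced dataset (roughly 9:1 class ratio)
X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)

param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}  # assumed ranges
search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid,
    scoring="balanced_accuracy",  # evaluation metric named in the text
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The data-level alternatives the passage lists (random under/oversampling, SMOTE-Tomek) would replace `class_weight` with a resampling step applied to each training fold, e.g. via the `imbalanced-learn` package's `SMOTETomek`.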
“…The MLP comprises three dense layers using the SMOTE-Tomek technique [50] with the following hyperparameters: five neurons in each of the first two layers and one neuron at the output layer, since the problem is binary. ReLU was the activation function for the first two layers, and a sigmoid function was used at the output layer.…”
Section: Methods
confidence: 99%
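A minimal sketch of that architecture, assuming scikit-learn's `MLPClassifier` (which applies a logistic/sigmoid output for binary targets automatically); the SMOTE-Tomek resampling step mentioned in the quote is omitted here:

```python
# Sketch of the cited MLP: two hidden layers of five ReLU units and a
# single sigmoid output unit for the binary task. Resampling omitted.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, random_state=1)
mlp = MLPClassifier(
    hidden_layer_sizes=(5, 5),  # five neurons in each of the two hidden layers
    activation="relu",          # ReLU on the hidden layers
    max_iter=1000,
    random_state=1,
)  # for binary y, scikit-learn uses a logistic (sigmoid) output unit
mlp.fit(X, y)
print(mlp.predict(X[:3]))
```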
“…(Author et al., A real-life machine learning experience for predicting school dropout at different stages) …techniques introduced above), there are 783 instances corresponding to students who drop out of college and 635 to students who complete their studies. To distribute the classes equally with a 1:1 ratio, we use a combination of the SMOTE and Tomek Links methods [21], [22]. These take observations from the dataset and look for each observation's nearest neighbours.…”
Section: Resampling
confidence: 99%
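The neighbour search underlying the Tomek Links half of that combination can be illustrated as follows; this is a toy sketch of the idea, not the cited authors' implementation. A Tomek link is a pair of opposite-class samples that are each other's nearest neighbour, and the majority-class member is typically removed:

```python
# Toy illustration of Tomek-link detection: find mutual nearest-neighbour
# pairs whose class labels differ.
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[0.0], [0.1], [1.0], [1.05], [2.0]])
y = np.array([0, 0, 0, 1, 1])

# n_neighbors=2 because each point's first neighbour is itself
nn = NearestNeighbors(n_neighbors=2).fit(X)
_, idx = nn.kneighbors(X)
nearest = idx[:, 1]  # index of each point's nearest other point

# A pair (i, j) is a Tomek link if the relation is mutual and classes differ
tomek_pairs = [(i, j) for i, j in enumerate(nearest)
               if nearest[j] == i and y[i] != y[j] and i < j]
print(tomek_pairs)  # → [(2, 3)]: the borderline majority/minority pair
```

SMOTE uses the same nearest-neighbour machinery in the opposite direction, interpolating new minority samples between a minority point and its minority-class neighbours; combining the two yields the SMOTE + Tomek Links procedure the passage describes.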