2022
DOI: 10.17977/um018v5i12022p87-100

The Effect of Resampling on Classifier Performance: an Empirical Study

Utomo Pujianto,
Muhammad Iqbal Akbar,
Niendhitta Tamia Lassela
et al.

Abstract: Class imbalance is a common problem in classification and can degrade classifier performance. Resampling is one way to address it. This study used 100 datasets from three websites: the UCI Machine Learning Repository, Kaggle, and OpenML. Each dataset goes through three processing stages: resampling, classification, and significance testing between the performance evaluation values of the combination of cl…
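As a rough illustration of the resampling stage, the sketch below balances a labeled dataset by random over-sampling or random under-sampling. The truncated abstract does not say which resampling methods the study evaluated, so this choice is an assumption, guided only by the random up-sampling and down-sampling mentioned in the citing statements. The function `random_resample` and its parameters are hypothetical names introduced here, not part of the paper.

```python
import random
from collections import Counter, defaultdict


def random_resample(rows, labels, mode="over", seed=0):
    """Balance classes by random over- or under-sampling.

    Hypothetical helper for illustration only; not the paper's code.
    mode="over"  -> every class is grown to the size of the largest class.
    mode="under" -> every class is shrunk to the size of the smallest class.
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Draw without replacement (keeps everything when already at target).
            chosen = rng.sample(members, target)
        else:
            # Keep the originals and add random duplicates up to the target.
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data: 8 samples of class 0 and 2 samples of class 1.
rows = list(range(10))
labels = [0] * 8 + [1] * 2

_, over_labels = random_resample(rows, labels, mode="over")
_, under_labels = random_resample(rows, labels, mode="under")
print(Counter(over_labels), Counter(under_labels))
```

In a pipeline like the one the abstract outlines, resampling of this kind would normally be applied to the training split only, so that the evaluation data keeps its original class distribution.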

Citations: cited by 2 publications (3 citation statements)
References: 31 publications
“…In addition, random down-sampling was implemented for Melanocytic Nevus and random up-sampling for the rest. This technique was evident to improve classification performance, according to research conducted by [29]. Figure 1 illustrates the image resize results, and Fig.…”
Section: Preprocessing (mentioning)
confidence: 88%
“…However, this method sometimes generates less-favorable results. More recently [29], [30], the proposed random up-sampling and down-sampling techniques by considering unbalanced datasets has been implemented to achieve a more improved classification performance. The last challenges in designing skin lesion CAD systems that employ handcrafted and automatic-based deep learning CNN algorithms are overfitting the model requiring high cost in terms of time complexity.…”
Section: Introduction (mentioning)
confidence: 99%
“…This disparity might impact the model's performance or the upcoming analysis. The preprocessing stage involves a resampling method utilizing the synthetic minority oversampling technique (SMOTE) [19] to rectify the data quantity imbalance.…”
Section: Figure 1 Sentiment Analysis Stages (mentioning)
confidence: 99%