How to design the fair experimental classifier evaluation

Stąpor, Katarzyna; Ksieniewicz, Paweł; García, Salvador; Woźniak, Michał

doi:10.1016/j.asoc.2021.107219

Cited by 62 publications

(24 citation statements)

References 29 publications


“…In order to assess whether DeepSMOTE returns statistically significantly better results than the reference resampling algorithms, we use the Friedman test with Shaffer post-hoc test [100] and the Bayesian Wilcoxon signed-rank test [101] for statistical comparison over multiple datasets. Both tests used a statistical significance level of 0.05.…”

Section: Statistical Analysis Of Results (mentioning)

confidence: 99%

DeepSMOTE: Fusing Deep Learning and SMOTE for Imbalanced Data

Dablain

Krawczyk

Chawla

2023

IEEE Trans. Neural Netw. Learning Syst.

213


Despite over two decades of progress, imbalanced data is still considered a significant challenge for contemporary machine learning models. Modern advances in deep learning have further magnified the importance of the imbalanced data problem, especially when learning from images. Therefore, there is a need for an oversampling method that is specifically tailored to deep learning models, can work on raw images while preserving their properties, and is capable of generating high-quality, artificial images that can enhance minority classes and balance the training set. We propose Deep synthetic minority oversampling technique (SMOTE), a novel oversampling algorithm for deep learning models that leverages the properties of the successful SMOTE algorithm. It is simple, yet effective in its design. It consists of three major components: 1) an encoder/decoder framework; 2) SMOTE-based oversampling; and 3) a dedicated loss function that is enhanced with a penalty term. An important advantage of DeepSMOTE over generative adversarial network (GAN)-based oversampling is that DeepSMOTE does not require a discriminator, and it generates high-quality artificial images that are both information-rich and suitable for visual inspection. DeepSMOTE code is publicly available at https://github.com/dd1github/DeepSMOTE.
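
The citation statement for this entry names the Friedman test for comparing several algorithms over multiple datasets. As a rough illustration only, the sketch below computes the mean ranks and the Friedman chi-square statistic in plain Python. The score table is hypothetical, ties receive average ranks but no tie correction is applied to the statistic, and the p-value, the Shaffer post-hoc step, and the Bayesian Wilcoxon signed-rank test are not covered; in practice a statistics library would normally handle those.

```python
def rank_row(scores):
    """Rank scores within one dataset: highest score gets rank 1, ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one (a tie).
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average of the 1-based positions in the run
        for j in range(pos, end + 1):
            ranks[order[j]] = avg_rank
        pos = end + 1
    return ranks


def friedman_statistic(results):
    """results[i][j] = score of algorithm j on dataset i. Returns (mean_ranks, chi2)."""
    n = len(results)      # number of datasets
    k = len(results[0])   # number of algorithms
    rank_rows = [rank_row(row) for row in results]
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4)
    return mean_ranks, chi2


# Hypothetical scores: 4 datasets (rows) x 3 algorithms (columns).
example_results = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.90, 0.50],
    [0.70, 0.60, 0.65],
]
mean_ranks, chi2 = friedman_statistic(example_results)
print(mean_ranks, chi2)
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom, so it would be compared against that distribution at the chosen significance level (0.05 in the quoted statement).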


“…To further verify the superiority of HMCBCG over bagging, the experimental results of KNE, KNU and DESKNN under different k_max values are subjected to paired t-tests with bagging, respectively. The paired t-test is recommended for the comparison of two classifiers on one dataset [53, 54]. A p-value less than 0.05 is considered statistically significant in this study.…”

Section: Experiments and Results (mentioning)

confidence: 99%

A novel combined dynamic ensemble selection model for imbalanced data to detect COVID-19 from complete blood count

Shen et al.

2021

Computer Methods and Programs in Biomedicine


Background: As blood testing is radiation-free, low-cost and simple to operate, some researchers use machine learning to detect COVID-19 from blood test data. However, few studies take into consideration the imbalanced data distribution, which can impair the performance of a classifier. Method: A novel combined dynamic ensemble selection (DES) method is proposed for imbalanced data to detect COVID-19 from complete blood count. This method combines data preprocessing and improved DES. Firstly, we use the hybrid synthetic minority over-sampling technique and edited nearest neighbor (SMOTE-ENN) to balance data and remove noise. Secondly, in order to improve the performance of DES, a novel hybrid multiple clustering and bagging classifier generation (HMCBCG) method is proposed to reinforce the diversity and local regional competence of candidate classifiers. Results: The experimental results based on three popular DES methods show that the performance of HMCBCG is better than that of bagging alone. HMCBCG+KNE obtains the best performance for COVID-19 screening with 99.81% accuracy, 99.86% F1, 99.78% G-mean and 99.81% AUC. Conclusion: Compared to other advanced methods, our combined DES model can improve accuracy, G-mean, F1 and AUC of COVID-19 screening.
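
The citation statement for this entry describes comparing two classifiers on one dataset with a paired t-test. A minimal sketch of the test statistic follows, assuming the two classifiers were scored on the same folds so the scores can be paired. The fold scores and the function name `paired_t_statistic` are hypothetical, and the test assumes the paired differences are roughly normally distributed. Only the t statistic and its degrees of freedom are computed; the p-value would come from a Student t distribution, typically through a statistics library.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired scores of two classifiers."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both classifiers need one score per fold")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired scores are required")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-fold scores for two classifiers on the same five folds.
proposed = [0.91, 0.93, 0.90, 0.94, 0.92]
baseline = [0.89, 0.90, 0.88, 0.91, 0.90]
t_value, dof = paired_t_statistic(proposed, baseline)
print(t_value, dof)
```

The result is judged significant when the two-sided p-value for `t_value` with `dof` degrees of freedom falls below the chosen threshold (0.05 in the quoted statement).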


“…Empirical evidence proves that accuracy is strongly biased to favor the majority class and might produce misleading conclusions. This fact motivated a search for new balanced measures obtaining a trade-off between positive and negative class performances [16]. Examples of such metrics are the arithmetic (eq.5), geometric (eq.6 or eq.7) or harmonic means (eq.9) between the two components: recall and precision (or specificity).…”

Section: A. Imbalanced Data Classification, 1) Metrics (mentioning)

confidence: 99%

Multicriteria Classifier Ensemble Learning for Imbalanced Data

2022


One of the vital problems with the imbalanced data classifier training is the definition of an optimization criterion. Typically, since the exact cost of misclassification of the individual classes is unknown, combined metrics and loss functions that roughly balance the cost for each class are used. However, this approach can lead to a loss of information, since different trade-offs between class misclassification rates can produce similar combined metric values. To address this issue, this paper discusses a multi-criteria ensemble training method for the imbalanced data. The proposed method jointly optimizes precision and recall, and provides the end-user with a set of Pareto optimal solutions, from which the final one can be chosen according to the user's preference. The proposed approach was evaluated on a number of benchmark datasets and compared with the single-criterion approach (where the selected criterion was one of the chosen metrics). The results of the experiments confirmed the usefulness of the obtained method, which on the one hand guarantees good quality, i.e., not worse than the one obtained with the use of single-criterion optimization, and on the other hand, offers the user the opportunity to choose the solution that best meets their expectations regarding the trade-off between errors on the minority and the majority class.
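
The citation statement for this entry contrasts accuracy with balanced measures built as arithmetic, geometric, or harmonic means of per-class components. The sketch below illustrates that contrast from confusion-matrix counts. The pairing chosen for each mean (recall with specificity for the arithmetic and geometric means, precision with recall for the harmonic mean) is one common convention and an assumption here, since the citing paper's numbered equations are not reproduced on this page. The function names and counts are hypothetical.

```python
import math


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def imbalance_metrics(tp, fn, fp, tn):
    """Compute accuracy and three balanced measures from binary confusion counts."""
    recall = safe_div(tp, tp + fn)        # true positive rate (minority class)
    specificity = safe_div(tn, tn + fp)   # true negative rate (majority class)
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "balanced_accuracy": (recall + specificity) / 2,              # arithmetic mean
        "g_mean": math.sqrt(recall * specificity),                    # geometric mean
        "f1": safe_div(2 * precision * recall, precision + recall),   # harmonic mean
    }


# Hypothetical test set with 10 positives and 90 negatives,
# scored by a classifier that always predicts the majority class.
majority_only = imbalance_metrics(tp=0, fn=10, fp=0, tn=90)
print(majority_only)
```

In this example accuracy stays high even though no minority-class example is detected, while the three balanced measures drop sharply, which is the bias the quoted statement refers to.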

