2019
DOI: 10.2478/amcs-2019-0057

Using Information on Class Interrelations to Improve Classification of Multiclass Imbalanced Data: A New Resampling Algorithm

Abstract: The relations between multiple imbalanced classes can be handled with a specialized approach which evaluates types of examples’ difficulty based on an analysis of the class distribution in the examples’ neighborhood, additionally exploiting information about the similarity of neighboring classes. In this paper, we demonstrate that such an approach can be implemented as a data preprocessing technique and that it can improve the performance of various classifiers on multiclass imbalanced datasets. It has led us …
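
To make the idea in the abstract concrete, the following is a minimal sketch of a neighborhood-based "safe level" score that also takes class similarity into account. It is an illustration under stated assumptions, not the paper's algorithm: the function name `safe_level`, the user-supplied `similarity` function, and the choice of Euclidean distance with `k` nearest neighbors are all hypothetical.

```python
import math

def safe_level(index, X, y, similarity, k=5):
    """Mean similarity between the class of X[index] and the classes
    of its k nearest neighbours (Euclidean distance).

    `similarity(a, b)` is a hypothetical user-supplied function returning
    a value in [0, 1], with 1.0 for identical classes. Assumes the dataset
    holds at least two examples.
    """
    others = [j for j in range(len(X)) if j != index]
    others.sort(key=lambda j: math.dist(X[index], X[j]))
    neighbours = others[:k]
    return sum(similarity(y[index], y[j]) for j in neighbours) / len(neighbours)


# Illustrative data: a low score marks an example surrounded by dissimilar classes.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
y = ["min", "min", "maj", "maj", "maj"]
sim = lambda a, b: 1.0 if a == b else 0.4  # assumed similarity values
score = safe_level(0, X, y, sim, k=2)
```

A score near 1 would suggest a "safe" example, while a score near 0 would suggest a difficult one; how such scores drive the actual resampling is not described in the visible part of the abstract.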

Citations: cited by 37 publications (14 citation statements)
References: 27 publications (57 reference statements)
“…This indicates that the accuracy of at least one of the models is significantly different from others, hence the null hypothesis that all descriptors' performances are the same is rejected. Nemenyi's [51] post hoc test of the average rank of accuracies was performed with a critical difference (CD) of 5.1308. The top three performing descriptors were PCA + LDA + Gabor, PCA + Gabor, and Gabor, in that order, while PCA was the worst-performing model with an average rank of 6.0.…”
Section: Results
Citation type: mentioning
confidence: 99%
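
For readers unfamiliar with the critical difference mentioned in this statement, it is commonly computed as CD = q_α · sqrt(k(k + 1) / (6N)) for k methods compared over N datasets. The sketch below shows that arithmetic only. The function name `nemenyi_cd` is hypothetical, and the inputs (k = 5, N = 10, q_α = 2.728) are illustrative assumptions, not the settings behind the CD of 5.1308 quoted above, which the statement does not give.

```python
import math

def nemenyi_cd(q_alpha, k, n_datasets):
    """Critical difference for the Nemenyi post hoc test.

    q_alpha    -- critical value for the chosen significance level and k
                  (looked up in a table; supplied by the caller)
    k          -- number of methods being compared
    n_datasets -- number of datasets (or folds) the ranks are averaged over
    """
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_datasets))


# Illustrative inputs only (assumed, not taken from the cited study).
cd = nemenyi_cd(q_alpha=2.728, k=5, n_datasets=10)
```

Two methods are then usually declared significantly different when their average ranks differ by at least the CD.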
“…The main goal is to ascertain if there are any base classifiers whose performance is significantly different from others and also to perform multiple comparison analysis. This was achieved by applying non-parametric procedures [44,45] individually to each of the four categories of dataset-target setups for informed statistical inferences. The Friedman test, a non-parametric variant of the repeated-measures Analysis of Variance, was used to test the null hypothesis that there is no significant difference in the performances (accuracies and time costs) of the classifiers.…”
Section: Statistical Significance and Rank Validation
Citation type: mentioning
confidence: 99%
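
The Friedman test described in this statement ranks the classifiers on each dataset and checks whether their average ranks differ more than chance would allow. As a rough sketch of the statistic (without the tie correction, and without the p-value lookup against a chi-square distribution with k − 1 degrees of freedom), one could write the following. The function names and the accuracy table are hypothetical and are not taken from the citing paper.

```python
def average_ranks(scores):
    """scores[i][j] is the score of method j on dataset i.
    Rank 1 goes to the highest score; tied scores share the mean rank."""
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic computed from the average ranks."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


# Made-up accuracies: 4 datasets (rows) x 3 classifiers (columns).
accuracies = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.95, 0.70, 0.75],
    [0.90, 0.85, 0.80],
]
stat = friedman_statistic(accuracies)  # 6.5 for this table
```

A large statistic relative to the chi-square critical value leads to rejecting the null hypothesis, after which a post hoc test is typically applied.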
“…It can be seen that $d_c$ is the distance corresponding to the $(R \cdot M)$-th value of $d_{ij}$. Eq. (6) gives the expression of the distance $\delta_i$, representing the minimum distance from particle $i$ to other particles that have a higher $\rho_i$:…”
Section: Apso-rf Unbalanced Data Classification Model
Citation type: mentioning
confidence: 99%
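
The quantities named in this statement resemble those used in density-peaks clustering. The sketch below is one possible reading, not the citing paper's Eq. (6), which is truncated here. It assumes that R is a fraction, that M is the number of pairwise distances sorted in ascending order, that density is a simple count of points closer than the cutoff, and that the densest point receives its largest distance by convention. The function name `density_peaks` and the parameter `ratio` are hypothetical.

```python
import math

def density_peaks(points, ratio):
    """Return (d_c, rho, delta) for a list of coordinate tuples.

    d_c   -- cutoff distance: the round(ratio * M)-th smallest pairwise distance
    rho   -- local density: number of other points closer than d_c (assumed kernel)
    delta -- minimum distance to any point with a higher rho; for a point with
             no denser neighbour, the maximum distance to any point (assumed)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pair_d = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    m = len(pair_d)
    idx = min(m - 1, max(0, round(ratio * m) - 1))
    d_c = pair_d[idx]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return d_c, rho, delta


# Three clustered points and one outlier (illustrative data).
d_c, rho, delta = density_peaks([(0, 0), (0, 1), (1, 0), (10, 10)], ratio=0.5)
```

Points with both a high density and a large delta are the usual candidates for cluster centers in this family of methods.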
“…The most commonly used methods to solve the problem of class imbalance are 1) resampling methods [6], which use under-sampling to remove majority-class instances or over-sampling to add minority-class instances, changing the original class distribution of the unbalanced data; this can increase the misclassification of minority classes and lose information about general rules. 2)…”
Section: Introduction
Citation type: mentioning
confidence: 99%
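
The resampling idea in this statement can be sketched with plain random under-sampling and over-sampling. This is the simplest baseline, shown for illustration only; it is neither the method of reference [6] nor the algorithm of the paper above. The function name `random_resample` and the `target` parameter are hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_resample(X, y, target, seed=0):
    """Bring every class to exactly `target` examples.

    Classes larger than `target` are under-sampled without replacement;
    smaller classes keep all their examples and are over-sampled by
    duplicating randomly chosen ones.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)
        else:
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out


# Illustrative multiclass data with class sizes 8, 3 and 1.
X = [[i] for i in range(12)]
y = ["a"] * 8 + ["b"] * 3 + ["c"] * 1
X_bal, y_bal = random_resample(X, y, target=4)
class_counts = Counter(y_bal)
```

Duplicating minority examples adds no new information and dropping majority examples discards some, which is the trade-off the statement alludes to.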