2019
DOI: 10.1016/j.neucom.2017.01.118
REMEDIAL-HwR: Tackling multilabel imbalance through label decoupling and data resampling hybridization

Abstract: Learning from imbalanced data is a deeply studied problem in standard classification and, more recently, also in multilabel classification. A handful of multilabel resampling methods have been proposed in recent years, aiming to balance the label distribution. However, these methods face a new obstacle, specific to multilabel data: the joint appearance of minority and majority labels in the same data patterns. We recently proposed a new algorithm designed to decouple imbalanced labels concur…

Cited by 40 publications (33 citation statements)
References 40 publications
“…We show that optimizing this objective simultaneously optimizes Shannon entropy, Gini-entropy and its modified variant, as well as the multi-class classification error. We expect that discussed tools can be used to obtain theoretical guarantees in the multi-label [28][29][30] and memory-constrained settings (we will explore this research direction in the future). We also consider extensions to different variants of the multi-class classification problem [31,32] and multi-output learning tasks [33,34].…”
Section: Discussion (mentioning)
confidence: 99%
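For reference, the two impurity measures named in the excerpt above have standard closed forms. The following is a minimal illustrative sketch only; the function names and the toy distribution are made up and are not taken from the cited work.

```python
import numpy as np

def shannon_entropy(p):
    """H(p) = -sum_i p_i * log2(p_i), ignoring zero-probability classes."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def gini_entropy(p):
    """Gini impurity: 1 - sum_i p_i^2."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))

# Toy class distribution (illustrative values only).
p = [0.7, 0.2, 0.1]
print(shannon_entropy(p))  # ≈ 1.157
print(gini_entropy(p))     # 0.46
```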
“…TCS. This metric (8) was presented in [13] as a straightforward way to assess the theoretical complexity of an MLD. It is based on just three traits of the dataset: f stands for the number of input features, k for the number of labels, and ls is the total number of label combinations in D. The larger the value returned by this measure, the harder it is to learn a predictive model from the dataset.…”
Section: Characterization Metrics (mentioning)
confidence: 99%
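The TCS score quoted above depends only on the three dataset traits f, k and ls. Below is a minimal sketch of how such a score could be computed, assuming the usual definition TCS = log(f · k · ls) from the cited work; the function name and the toy values are hypothetical.

```python
import math

def theoretical_complexity_score(f: int, k: int, ls: int) -> float:
    """Hypothetical helper: TCS = log(f * k * ls), where
    f  = number of input features,
    k  = number of labels,
    ls = number of distinct label combinations (labelsets) in the MLD."""
    return math.log(f * k * ls)

# Toy example (illustrative values only): 294 features, 6 labels,
# 15 distinct labelsets.
print(theoretical_complexity_score(294, 6, 15))  # ≈ 10.18
```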
“…Among them, the SMOTE algorithm is one of the most popular (Bowyer, Chawla, Hall, & Kegelmeyer, ). Some techniques were also proposed for MLC problems (Charte & Charte, ; Zhang, Li, & Liu, ; Charte, Rivera, del Jesus, & Herrera, ), to name a few; however, some of them are strategy dependent, whereas others favour a specific transformation approach. In this work, SMOTE is applied as a local solution, dispensing the use of an intrinsically multilabel oversampling technique.…”
Section: Food Truck Recommendation (mentioning)
confidence: 99%
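As context for the excerpt above, this is a minimal sketch of what a "local" use of SMOTE looks like: each label is oversampled independently under a binary-relevance decomposition instead of using an intrinsically multilabel resampler. It assumes the imbalanced-learn and scikit-learn packages; the function name and the choice of base classifier are illustrative, not the cited authors' implementation.

```python
from imblearn.over_sampling import SMOTE          # assumes imbalanced-learn is installed
from sklearn.linear_model import LogisticRegression

def binary_relevance_with_smote(X, Y):
    """Train one binary classifier per label, oversampling each binary
    problem with SMOTE before fitting (a 'local' use of SMOTE).
    X: (n_samples, n_features), Y: (n_samples, n_labels) binary matrix."""
    models = []
    for j in range(Y.shape[1]):
        y_j = Y[:, j]
        # Oversample the minority class of this single label only.
        # Note: SMOTE needs more minority samples than its k_neighbors setting.
        X_res, y_res = SMOTE().fit_resample(X, y_j)
        clf = LogisticRegression(max_iter=1000).fit(X_res, y_res)
        models.append(clf)
    return models
```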
“…The use of these three measures can quantify and assess the occurrence or absence of the previously mentioned label problems. Some works related to label imbalance (Charte, Rivera, del Jesús, & Herrera, 2015; Charte et al., 2017) tried to improve the multilabel measures, especially the macro-F1. However, they did not investigate the occurrence of these problems.…”
Section: Label Prediction Problems (mentioning)
confidence: 99%
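Since the excerpt refers to macro-F1 as the headline multilabel measure, here is a small sketch of how it is typically computed with scikit-learn; the toy label matrices are made up for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multilabel indicator matrices (rows = instances, columns = labels);
# values are illustrative only.
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# Macro-F1 averages the per-label F1 scores, so minority labels weigh
# as much as majority ones -- which is why imbalance-aware methods target it.
print(f1_score(Y_true, Y_pred, average="macro", zero_division=0))  # ≈ 0.56
```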