2016
DOI: 10.1007/s13748-016-0094-0

Learning from imbalanced data: open challenges and future directions

Abstract: Despite more than two decades of continuous development, learning from imbalanced data is still a focus of intense research. Starting as a problem of skewed distributions in binary tasks, this topic has evolved far beyond that conception. With the expansion of machine learning and data mining, combined with the arrival of the big data era, we have gained a deeper insight into the nature of imbalanced learning, while at the same time facing new emerging challenges. Data-level and algorithm-level methods are constantly b…

Citations: Cited by 1,870 publications (1,072 citation statements)
References: 63 publications
“…In particular, we share the opinions expressed in the position paper (Krawczyk 2016). According to it, when decomposing multiple imbalanced classes, pairwise relations between only two classes may be too strong an over-simplification, since they do not reflect the more complex relations among several classes, where one class influences several neighbouring classes at the same time.…”
Section: Multiple Imbalanced Classes
confidence: 89%
“…In this study, we decided to use the method in which the type of an example is identified by analysing the class labels of its k nearest neighbours. For instance, if k = 5, the type of the example is assigned in the following way (Napierala and Stefanowski 2012; 2016): 5:0 or 4:1 - the example is labelled as a safe example; 3:2 or 2:3 - a borderline example; 1:4 - a rare example; 0:5 - an outlier. This rule can be generalized to higher k values; however, results of recent experiments (Napierala and Stefanowski 2016) show that these lead to a similar categorization of the considered datasets.…”
Section: Methods
confidence: 99%
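
The neighbour-counting rule quoted above lends itself to a short sketch. The following is a minimal illustration, not code from Napierala and Stefanowski (2012; 2016): it assumes scikit-learn's NearestNeighbors with its default Euclidean metric, and the function name categorize_minority_examples is hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def categorize_minority_examples(X, y, minority_label, k=5):
    """Label each minority-class example as safe/borderline/rare/outlier
    based on the class labels of its k nearest neighbours (k = 5 here)."""
    X, y = np.asarray(X), np.asarray(y)
    # Request k + 1 neighbours because the query point itself is returned first.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    types = {}
    for i in np.where(y == minority_label)[0]:
        neighbours = idx[i, 1:]                      # drop the point itself
        same = int(np.sum(y[neighbours] == minority_label))
        if same >= 4:        # 5:0 or 4:1 -> safe
            types[i] = "safe"
        elif same >= 2:      # 3:2 or 2:3 -> borderline
            types[i] = "borderline"
        elif same == 1:      # 1:4 -> rare
            types[i] = "rare"
        else:                # 0:5 -> outlier
            types[i] = "outlier"
    return types
```

The distance metric is a simplifying assumption here; the quoted rule only fixes how neighbour labels are counted, so any metric appropriate to the data could be substituted.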