2021
DOI: 10.1109/tpami.2019.2929166

Multiset Feature Learning for Highly Imbalanced Data Classification

Abstract: With the expansion of data, increasing imbalanced data has emerged. When the imbalance ratio of data is high, most existing imbalanced learning methods decline in classification performance. To address this problem, a few highly imbalanced learning methods have been presented. However, most of them are still sensitive to the high imbalance ratio. This work aims to provide an effective solution for the highly imbalanced data classification problem. We conduct highly imbalanced learning from the perspective of f…

Cited by 106 publications (47 citation statements)
References 59 publications
“…The experiments in the first block were performed on artificial data sets taken from the paper by Napierala et al (2010) because using synthetic data allows us to know their characteristics a priori and analyze the effects of resampling in a fully controlled environment. The second group of experiments was on a well-known benchmark suite of real-life databases widely used for class imbalance problems (Chen et al, 2019; Jing et al, 2019; Kovács, 2019; Kuncheva et al, 2019; Lopez-Garcia et al, 2019), which are all available at the KEEL database repository (Alcalá-Fdez et al, 2011). The results of both experiments were estimated by 5-fold stratified cross-validation in order to have a sufficient amount of positive examples in the test partitions.…”
Section: Methods (mentioning)
confidence: 99%
“…As such, advancing the development of algorithms and approaches for improved identification of rare classes is a key challenge for deep learning-based taxonomic identification (25). Solutions to this challenge could be inspired by class resampling and cost-sensitive training (91) or by multiset feature learning (92).…”
Section: Potential Deep Learning Applications In Entomology (mentioning)
confidence: 99%
“…Note that the dataset is common in both the criteria, giving us a total of 11 datasets. We choose these two categories because they are of special interest in research related to imbalanced datasets and have received extensive attention in this research area (Anand et al 2010; Hooda et al 2018; Jing et al 2019; Blagus and Lusa 2013).…”
Section: Datasets Used For Validation (mentioning)
confidence: 99%