2014 22nd International Conference on Pattern Recognition
DOI: 10.1109/icpr.2014.258
Handling Imbalanced Datasets by Partially Guided Hybrid Sampling for Pattern Recognition

Abstract: High imbalance in real-world domains is a direct result of the rarity of interesting events, which produces skewed datasets. Without dataset rebalancing, the learning algorithm encounters extremely few minority-class samples and therefore becomes biased towards the majority class in classification tasks. Hence, properly handling imbalanced datasets is a crucial issue in the pattern recognition domain. We have employed bootstrapping by simultaneous oversampling of the minority class and unders…
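The abstract describes simultaneous oversampling of the minority class and undersampling of the majority class. The sketch below only illustrates that general hybrid-sampling idea using the imbalanced-learn library; the chosen samplers and sampling ratios are illustrative assumptions, not the paper's partially guided bootstrapping procedure.

```python
# Minimal sketch of hybrid resampling: oversample the minority class,
# then undersample the majority class. Illustrative only; not the
# partially guided scheme proposed in the paper.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import Pipeline

# Heavily skewed toy dataset (roughly 19:1 majority-to-minority ratio).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
print("original class counts:", Counter(y))

# Oversample the minority class up to 50% of the majority size,
# then undersample the majority class down to a 1:1 ratio.
hybrid = Pipeline(steps=[
    ("oversample", SMOTE(sampling_strategy=0.5, random_state=0)),
    ("undersample", RandomUnderSampler(sampling_strategy=1.0, random_state=0)),
])
X_res, y_res = hybrid.fit_resample(X, y)
print("resampled class counts:", Counter(y_res))
```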

Cited by 38 publications (10 citation statements)
References 20 publications

“…The authors utilize concurrent oversampling and undersampling to deal with heavily skewed data distributions [68]. Additionally, LVQ-SMOTE integrates the SMOTE oversampler with feature codebooks learned by vector quantization to create synthetic samples that use more feature space than the other SMOTE variants [69]. Also, comparable performance can be obtained using Polynom-fit SMOTE, which oversamples the minority class using polynomial fitting functions.…”
Section: Oversampling Methods
confidence: 99%
“…Moreover, the G-SMOTE approach pioneered hybrid oversampling that is driven by previously unknown patterns derived from the minority class and by randomization. The authors utilize concurrent oversampling and undersampling to deal with heavily skewed data distributions [68]. Additionally, LVQ-SMOTE integrates the SMOTE oversampler with feature codebooks learned by vector quantization to create synthetic samples that use more feature space than the other SMOTE variants [69].…”
Section: End-to-end Machine Learning Framework: Design and Implementation
confidence: 99%
“…Moreover, the G-SMOTE method introduced hybrid oversampling, in which sampling is partially guided by hidden patterns obtained from the minority class and by randomization. Highly skewed data distributions are handled by simultaneous oversampling and undersampling (Sandhan & Choi, 2014). LVQ-SMOTE combines the SMOTE oversampler with feature codebooks obtained by learning vector quantization in order to generate synthetic samples that occupy more feature space than the other SMOTE variants (Nakamura et al., 2013).…”
Section: Oversampling Methods
confidence: 99%
“…In addition to two previously proposed methods utilizing the concept of class potential, Radial-Based Oversampling (RBO) [17] and Radial-Based Undersampling (RBU) [34], we considered several other state-of-the-art resampling strategies. We based our choice on a recent ranking constructed by Kovács [48], out of which we selected the following best-performing methods: SMOTE [18], Polynomial Fitting SMOTE (pf-SMOTE) [49], Oversampling with Rejection (Lee) [50], Synthetic Minority Oversampling Based on Sample Density (SMOBD) [51], Partially Guided Oversampling (G-SMOTE) [52], Learning Vector Quantization-based SMOTE (LVQ-SMOTE) [53], Assembled SMOTE (A-SMOTE) [54] and SMOTE combined with Tomek Links (SMOTE-TL) [55]. With the exception of RBO and RBU, the implementations of the reference methods provided in the smotevariants library [56] were utilized.…”
Section: Set-up
confidence: 99%
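The set-up above states that the reference methods were taken from the smotevariants library [56]. The sketch below is a hedged usage example of that library; it assumes the Python package name smote_variants, its oversampler.sample(X, y) interface, and class names such as polynom_fit_SMOTE and LVQ_SMOTE as given in the library's documentation, which may differ across versions.

```python
# Sketch of running a few of the reference oversamplers listed above via the
# smote_variants package [56]. Class names and the sample(X, y) interface are
# assumptions based on the library's documentation and may vary by version.
import numpy as np
import smote_variants as sv
from sklearn.datasets import make_classification

# Imbalanced toy dataset (about 9:1 majority-to-minority ratio).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# A subset of the methods selected in the set-up: SMOTE, pf-SMOTE, LVQ-SMOTE.
oversamplers = [sv.SMOTE(), sv.polynom_fit_SMOTE(), sv.LVQ_SMOTE()]

for oversampler in oversamplers:
    # Each oversampler returns a resampled copy of the dataset.
    X_res, y_res = oversampler.sample(X, y)
    print(type(oversampler).__name__,
          "-> class counts:", np.bincount(y_res.astype(int)))
```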