2015
DOI: 10.4236/jdaip.2015.33004

An Improved Algorithm for Imbalanced Data and Small Sample Size Classification

Abstract: Traditional classification algorithms do not perform well on imbalanced data sets or on data sets with small sample sizes. To address this problem, a novel method is proposed that changes the class distribution by adding virtual samples generated with the windowed regression over-sampling (WRO) method. WRO reflects not only the additive effects but also the multiplicative effect between samples. A comparative study between the proposed method and other over-sampling methods such as syn…

citations
Cited by 16 publications
(6 citation statements)
references
References 10 publications
“…Wisconsin breast cancer dataset (Ebenuwa et al, 2019) has multivariate data types, all 10 instances are integer types and it has 699 instances. Yeast dataset (Hu et al, 2015) have 8 real attributes with 1,484 instances. Various kinds of datasets from the Keel dataset repository (Verbiest et al, 2012;Ahmed et al, 2019;Gong and Kim, 2017;Jedrzejowicz et al, 2018;Fernández et al, 2017;Wang, 2019) are mostly used in handling imbalanced datasets.…”
Section: Used Dataset In Researches (mentioning)
confidence: 99%
“…Abalone dataset (He et al, 2008) contains 4,177 instances with 8 attributes, attributes types are categorical. Glass dataset (Hu et al, 2015) with multivariate data types and 214 instances. E coli 2 dataset (Elhassan and Aljurf, 2016) carries 363 instances and 7 attributes.…”
Section: Used Dataset In Researches (mentioning)
confidence: 99%
“…Among the results, the city's 294,000 samples were obtained by SMOTE. It is very likely that the original characteristics of the city category will be lost, resulting in inaccuracies in the classification results [69].…”
Section: Stratified Smote Algorithm (mentioning)
confidence: 99%
“…[20]. The F-measure is used to evaluate classification performance on the minority class of an imbalanced data set, whereas the G-mean evaluates performance across all classes [21], [22].…”
Section: Literature Summary and Related Work (unclassified)
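
The two evaluation measures named in the last citation statement, the F-measure and the G-mean, can be illustrated with a minimal sketch. It assumes a binary problem in which the minority class is treated as the positive class, and it uses the common definitions (F-measure as the weighted harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity). The function names and the confusion-matrix counts are hypothetical and do not come from the cited papers.

```python
import math


def f_measure(tp, fp, fn, beta=1.0):
    """F-measure of the positive (minority) class from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def g_mean(tp, fp, fn, tn):
    """Geometric mean of sensitivity and specificity (binary case)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts: 10 minority samples, 90 majority samples.
tp, fn, fp, tn = 8, 2, 10, 80
print(f"F-measure: {f_measure(tp, fp, fn):.4f}")
print(f"G-mean:    {g_mean(tp, fp, fn, tn):.4f}")
```

The F-measure ignores true negatives, so it tracks how well the minority class alone is recovered, while the G-mean drops toward zero if either class is poorly recognised.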