2015
DOI: 10.1007/s11571-015-9350-4

Integrating new data balancing technique with committee networks for imbalanced data: GRSOM approach

Abstract: To deal with imbalanced data in a classification problem, this paper proposes a data balancing technique to be used in conjunction with a committee network. The proposed data balancing technique is based on the growing ring self-organizing map (GRSOM), an unsupervised learning algorithm. GRSOM balances the data by growing new data on a well-defined ring structure, which is developed iteratively based on the winning node near the samples. Accordingly, the new balanced data still pr…

Cited by 17 publications (12 citation statements)
References 34 publications
“…A one-way analysis of variance was used to compare the differences between the metabolome and nonmetabolome biomarkers in routine physical examination. The random sampling method was used to deal with the sample imbalance between workers with and without MS (19). The area under the receiving operating characteristic curve (AUC), true positive rate (also called sensitivity or recall), and false positive rate (specificity) are represented in a graphical plot.…”
Section: Discussion (mentioning)
confidence: 99%
“…The established risk assessment model revealed that the main risk biomarkers were absolute basophil count (OR: 3.38, CI: 1.05-6.85), platelet packed volume (OR: 2.63, CI: 2.31-3.79), leukocyte count (OR: 2.01, CI: 1.79-2.19), red blood cell count (OR: 1.99, CI: 1.80-2.71), and alanine aminotransferase level (OR: 1.53, CI: 1.12-1.98). Furthermore, favorable results…”
Section: Methods (mentioning)
confidence: 99%
“…RQ4 can be divided into two more specific questions: RQ4.1, which concerns whether the research topics addressed by SLR articles in the current literature are comparatively few, and RQ4.2, which concerns the evidence that SLR studies on data preprocessing are lacking due to a lack of primary studies. A large number of studies in the existing literature have addressed data-related issues with an emphasis on data preprocessing.…”
Table rows embedded in the excerpt (column headers not recoverable): [122] N N N Y 1.0; [55] N P N Y 1.5; [123] N P N Y 1.5; [124] N P N Y 1.5; [125] N P N Y 1.5; [104] N N N Y 1.0; [69] N N N Y 1.0; [126] N N N Y 1.0; [84] N P N Y 1.5; [127] N P N Y 1.5; [128] N P N Y 1.5; [129] N P N Y 1.5
Section: What Are the Limitations of Current Research? (mentioning)
confidence: 99%
“…This problem has attracted significant attention from the research community over the past years [16], and solutions addressing such a problem can be broadly categorised into data-level and algorithm-level methods [10]. Data … More recent resampling methods include k-means clustering [11, 27], density-based clustering [4, 27, 6], neural networks [8], and ensemble [38]. These methods are designed to produce better data distribution.…”
Section: Introduction (mentioning)
confidence: 99%
“…https://github.com/fonkafon/NB-undersampling Results.git 8 In Fig. 6, SMOTE has similar performance in sensitivity to kmUnder (hence the line is not visible)…”
mentioning
confidence: 99%