Empowering Imbalanced Data in Supervised Learning: A Semi-supervised Learning Approach

Almogahed, Bassam; Kakadiaris, Ioannis A.

doi:10.1007/978-3-319-11179-7_66

Cited by 4 publications

(3 citation statements)

References 13 publications

“…The rise of the likelihood for overfitting is the main drawback of random over-sampling techniques due to the replicating of minority instances (Almogahed and Kakadiaris, 2014[8]). Chawla et al (2002[23]) proposed the Synthetic Minority Oversampling Technique (SMOTE) which is done by creating synthetic examples rather than by over-sampling with replacement.…”

Section: Cornerstones of a CAD System (mentioning)

confidence: 99%

“…The minority class is over-sampled by taking each minority class sample and introducing synthetic examples along the line segments joining any/all of the k minority class nearest neighbours. SMOTE is an effective oversampling technique which has some deficiency such as over-generation because the generation of synthetic samples increases the classes overlapping (Almogahed and Kakadiaris, 2014[8]). Over-generation is problematic in the case of skewed class distribution with sparse minority class versus majority class (Maciejewski and Stefanowski, 2011[90]).…”

Section: Cornerstones of a CAD System (mentioning)

confidence: 99%
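
The interpolation step described in the statement above can be illustrated with a minimal sketch. It is not the reference SMOTE implementation: the function name `smote_sketch`, its parameters, and the brute-force neighbour search are assumptions chosen for illustration only.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic points on segments joining minority samples
    to their k nearest minority neighbours (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Brute-force squared Euclidean distances between minority samples.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]

    # Pick a base sample and one of its k neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position along the joining segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[nb] - X_min[base])
```

Because every synthetic point lies between two existing minority samples, a sparse minority class surrounded by majority samples can end up with synthetic points inside majority regions, which is the class-overlap concern raised in the statement.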

Foundation and methodologies in computer-aided diagnosis systems for breast cancer detection

Jalalian, Mashohor, Mahmud, et al. (2017). EXCLI Journal; 16:Doc113; ISSN 1611-2156

“…However, it has been argued that random under-sampling may lose some relevant information, while randomly over-sampling with replacement the smallest class may lead to overfitting (Almogahed and Kakadiaris 2014). More sophisticated sampling techniques may allow to avoid these drawbacks.…”

Section: Introduction (mentioning)

confidence: 99%

Matrix sketching for supervised classification with imbalanced classes

Falcone, Anderlucci, Montanari (2021). Data Min Knowl Disc

The presence of imbalanced classes is more and more common in practical applications and it is known to heavily compromise the learning process. In this paper we propose a new method aimed at addressing this issue in binary supervised classification. Re-balancing the class sizes has turned out to be a fruitful strategy to overcome this problem. Our proposal performs re-balancing through matrix sketching. Matrix sketching is a recently developed data compression technique that is characterized by the property of preserving most of the linear information that is present in the data. Such property is guaranteed by the Johnson-Lindenstrauss’ Lemma (1984) and allows to embed an n-dimensional space into a reduced one without distorting, within an $\epsilon$-size interval, the distances between any pair of points. We propose to use matrix sketching as an alternative to the standard re-balancing strategies that are based on random under-sampling the majority class or random over-sampling the minority one. We assess the properties of our method when combined with linear discriminant analysis (LDA), classification trees (C4.5) and Support Vector Machines (SVM) on simulated and real data. Results show that sketching can represent a sound alternative to the most widely used rebalancing methods.
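
As a rough illustration of the idea in the abstract above, the sketch below compresses the rows of a majority-class data matrix with a Gaussian random projection. This is only one common form of sketching and is not necessarily the procedure used by the authors; the function name `gaussian_sketch` and the choice of a Gaussian sketching matrix are assumptions.

```python
import numpy as np

def gaussian_sketch(X, k, seed=0):
    """Compress an (n, d) matrix to (k, d) with a Gaussian sketch
    (hypothetical helper, not the authors' code)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Entries drawn from N(0, 1/k), so that E[S.T @ S] is the identity
    # and (S @ X).T @ (S @ X) equals X.T @ X in expectation.
    S = rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, n))
    return S @ X

# Usage: shrink a simulated majority class of 500 rows to 50 rows,
# for example to match the size of a minority class.
X_major = np.random.default_rng(1).normal(size=(500, 3))
X_sketched = gaussian_sketch(X_major, k=50)
```

Unlike random under-sampling, each sketched row is a random linear combination of all original rows, so no single observation is simply discarded.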

Diagnosis system for imbalanced multi-minority medical dataset

Shilaskar, Ghatol (2018). Soft Comput
