2023
DOI: 10.1007/s10994-022-06296-4

A theoretical distribution analysis of synthetic minority oversampling technique (SMOTE) for imbalanced learning

Abstract: Class imbalance occurs when the class distribution is not equal. Namely, one class is under-represented (minority class), and the other class has significantly more samples in the data (majority class). The class imbalance problem is prevalent in many real world applications. Generally, the under-represented minority class is the class of interest. The synthetic minority over-sampling technique (SMOTE) method is considered the most prominent method for handling unbalanced data. The SMOTE method generates new s…
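
The abstract is cut off before it describes how SMOTE produces new samples. As a rough illustration only, the sketch below follows the commonly described form of the technique: pick a minority sample, pick one of its k nearest minority neighbours, and place a synthetic point at a random position on the line segment between them. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not taken from the paper; choosing base samples at random is a simplification.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric tuples (minority class only).
    This is an illustrative simplification, not the paper's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbours of x (Euclidean distance), excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        z = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # synthetic point lies on the segment from x towards z
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, z)))
    return synthetic


# Toy minority class in two dimensions (made-up values)
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the generated points never leave the region spanned by the minority class in this sketch.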

Cited by 70 publications (19 citation statements)
References 60 publications
“…Information gain involves entropy to find the best term. If the information gain value of a term is greater, then the feature is considered more significant and more important [28].…”
Section: Feature Selection (mentioning)
confidence: 99%
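
The statement above ties information gain to entropy. A minimal sketch of that relationship, assuming discrete feature values and class labels, is given below: the gain of a feature is the label entropy minus the weighted entropy of the labels within each feature value. The function names and the toy spam/ham data are hypothetical and do not come from the citing paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy example: does a term occur (1) or not (0) in each document?
labels = ["spam", "spam", "ham", "ham"]
term_a = [1, 1, 0, 0]  # occurrence matches the class exactly
term_b = [1, 0, 1, 0]  # occurrence is unrelated to the class

gain_a = information_gain(term_a, labels)
gain_b = information_gain(term_b, labels)
```

In this toy data `term_a` receives the higher gain, so a selection step that ranks terms by information gain would prefer it over `term_b`.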
“…Oversampling by creating new instances can produce unsatisfactory results if the newly formed samples are wrongly thought to be part of the minority class based only on their closeness to existing minority examples. When dealing with imbalanced datasets, it is important to avoid conflating the usage of simulated data with oversampling class-imbalanced data [53,54].…”
Section: Data (mentioning)
confidence: 99%
“…In this paper we have utilized total of two different ML's to explore, explain and model the behavior of brucellosis disease with respect to several input parameters [12]. The ML's are selected bases on their suitability and acceptability are given below:…”
Section: Machine Learning Models (mentioning)
confidence: 99%