2016 SAI Computing Conference (SAI) 2016
DOI: 10.1109/sai.2016.7556019

Handling class imbalance in direct marketing dataset using a hybrid data and algorithmic level solutions

Abstract: Class imbalance is a major problem in machine learning. It occurs when the number of instances in the majority class is significantly larger than the number of instances in the minority class. This problem recurs in most datasets, including the direct marketing dataset used in this paper. In direct marketing, businesses are interested in identifying potential buyers, and charities wish to identify potential givers. Several solutions have been suggested in the literature…

Citations: cited by 11 publications (8 citation statements)
References: 17 publications
“…In fact, the more the ML techniques have evolved over time, the more the schemes for spreading fake news have changed. • We apply re-sampling techniques, such as under-sampling and over-sampling, due to the class imbalance of the real-world dataset [14,15]. Disproportion between classes still represents an open issue and challenge for researchers focused on classification problems.…”
Section: Introduction
confidence: 99%
“…HybridDA amalgamates SMOTE oversampling, random undersampling (RUS), and SVM optimization utilizing grid search [20]. This method combines both data level and algorithm level approaches, using the data level for generating samples and the algorithm level for optimization.…”
Section: Introduction
confidence: 99%
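
The data-level half of the hybrid idea quoted above (oversample the minority class, undersample the majority class) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the cited paper's implementation: the function names, the `target` size, and `k` are hypothetical, the oversampler only mimics SMOTE's core interpolation step, and the algorithm-level step (SVM tuning by grid search) is left out.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (SMOTE's core idea).
    Assumes at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of n_keep majority points (RUS)."""
    return random.Random(seed).sample(majority, n_keep)


def hybrid_resample(majority, minority, target, k=3, seed=0):
    """Shrink the majority class and grow the minority class toward `target`
    points each (hypothetical balancing rule chosen for illustration)."""
    kept = random_undersample(majority, min(target, len(majority)), seed)
    n_new = max(0, target - len(minority))
    grown = list(minority) + smote_like_oversample(minority, n_new, k, seed)
    return kept, grown
```

A classifier would then be trained on `kept + grown`; in the hybrid scheme described above, its hyperparameters would be tuned separately at the algorithm level.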
“…The class imbalance problem in direct marketing is usually solved by using one of the following three approaches: data-based approaches [7,8], algorithm-based approaches [9], or cost-based approaches [10]. Namely, the data-based approach balances classes using resampling techniques; algorithm-based solutions are based on specifically modified algorithms, while cost-based approaches allocate different misclassification costs to different class examples [11].…”
Section: Introduction
confidence: 99%
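
To make the cost-based approach in the statement above concrete, here is a minimal sketch that assigns each class a misclassification cost and totals the cost of a set of predictions. The inverse-frequency weighting rule and both function names are assumptions made for illustration; the quoted text does not say how the costs are chosen.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count),
    so rarer classes receive a higher misclassification cost."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Total cost of the errors, where each error costs the weight of its true class."""
    return sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
```

With 90 majority and 10 minority examples, this rule makes one missed minority example cost as much as nine misclassified majority examples, so a model that always predicts the majority class is no longer rewarded for it.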