2021
DOI: 10.17798/bitlisfen.939733

Combination of PCA with SMOTE Oversampling for Classification of High-Dimensional Imbalanced Data

Abstract: Imbalanced data classification is a common issue in data mining, where classifiers are skewed towards the larger class. Classification of high-dimensional skewed (imbalanced) data is of great interest to decision-makers because it is more difficult. Dimension reduction, a process in which the number of variables is reduced, allows high-dimensional datasets to be interpreted more easily, at the cost of some information loss. In this study, a method combining SMOTE oversampling with principal component analysis is proposed to sol…

Cited by 10 publications (8 citation statements)
References 19 publications
“…In datasets with class imbalance problem, most machine learning techniques ignore minority class performance and therefore underperform in minority class. One approach to these datasets is to oversample the minority class and is called the Synthetic Minority Oversampling Technique, or SMOTE for short (9). In order to eliminate the class imbalance problem in the colon cancer gene expression dataset (22 normal and 40 tumor tissues), the SMOTE method was applied before feature selection.…”
Section: Data Preprocessing and Modeling
Type: mentioning
confidence: 99%
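The oversampling idea described in the statement above can be sketched as follows. This is a simplified, standard-library illustration of SMOTE-style interpolation, not the implementation used in the cited work or in the article itself. The function name `smote`, the parameters `k` and `seed`, and the three-feature random data are assumptions made for the example; only the class sizes (22 and 40) come from the quoted text.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration: a simplified SMOTE variant.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Placeholder data (random, NOT the gene expression dataset):
# 22 minority and 40 majority samples with 3 features each.
rng = random.Random(42)
normal = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(22)]
tumor = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(40)]

# Add enough synthetic minority samples to match the majority class size.
balanced_normal = normal + smote(normal, n_new=len(tumor) - len(normal))
print(len(balanced_normal), len(tumor))  # prints "40 40"
```

Each synthetic sample lies on the line segment between a minority point and one of its nearest minority neighbours, so no synthetic value falls outside the range already observed in the minority class.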
“…It is the process of recognizing patterns, concepts, and other objects in order to better comprehend them and classify them based on incoming data [5]. Classification can help uncover abnormalities when developing a learning model from prior data [4]. There are various classification algorithms, each of which builds a prediction model in a different way.…”
Section: Classification Algorithms
Type: mentioning
confidence: 99%
“…It can be used to create guesses regarding category variable names [34]. Each branch might be relegated to the training sample category [4]. The decision tree is formulated as follows:…”
Section: Decision Tree (Dt)
Type: mentioning
confidence: 99%