2023
DOI: 10.1186/s40537-023-00684-w

The effect of feature extraction and data sampling on credit card fraud detection

Abstract: Training a machine learning algorithm on a class-imbalanced dataset can be a difficult task, a process that could prove even more challenging under conditions of high dimensionality. Feature extraction and data sampling are among the most popular preprocessing techniques. Feature extraction is used to derive a richer set of reduced dataset features, while data sampling is used to mitigate class imbalance. In this paper, we investigate these two preprocessing techniques, using a credit card fraud dataset and fo…

Cited by 36 publications (7 citation statements)
References 45 publications

“…Furthermore, Salekshahrezaee et al [43] aimed to develop robust CCFD models by studying the impact of feature extraction and data resampling on the following machine learning classifiers: CatBoost, random forest, extreme gradient boosting (XGBoost), and LightGBM. The feature extraction was achieved using principal component analysis (PCA) and convolutional autoencoder (CAE), while the data resampling methods include SMOTE, random undersampling (RUS), and SMOTE Tomek techniques.…”
Section: Related Work (mentioning)
confidence: 99%
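
For readers unfamiliar with the resampling step named in the statement above, random undersampling (RUS) can be sketched roughly as follows. This is a minimal illustration only, not the cited paper's implementation: the function name `random_undersample`, the toy data, the fixed seed, and the 1:1 target class ratio are all assumptions made for the example.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: randomly drop rows until every class
    has as many rows as the smallest class (assumed 1:1 target)."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Size of the smallest (minority) class.
    target = min(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample without replacement; the minority class is kept whole.
        kept = rng.sample(group, target)
        out_rows.extend(kept)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy, made-up data: 95 "legitimate" rows (label 0) and 5 "fraud" rows (label 1).
rows = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

bal_rows, bal_labels = random_undersample(rows, labels)
class_counts = Counter(bal_labels)
```

In practice, undersampling would be applied to the training split only, so that the evaluation data keeps its original class distribution.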
“…To detect credit card fraud, Salekshahrezaee et al [15] utilized four integrated learning classifiers based on Decision Tree (DT) classifiers in conjunction with distinct feature selection approaches. Also, their dataset was produced by the Kaggle community.…”
Section: Related Work (mentioning)
confidence: 99%
“…See the references [20][21][22] for data, R code and an analytic method of COVID data respectively. The harmonic mean is used to portray the uncertainty panarthropod relationship [23] in multicriteria optimization [24], in feature extraction of credit card frauds [25], in Bayesian analysis of atherosclerosis cardiovascular data analysis [26], and in hierarchical taxonomy [27], among many other scientific applications. In other words, the number of zero counts is equal to n − n Biased , where n is the number of entries in the original data.…”
Section: Incidence-rate-restricted Poisson Model With a Complementary... (mentioning)
confidence: 99%