2022
DOI: 10.1088/1742-6596/2161/1/012072

Detection of fraudulent credit card transactions: A comparative analysis of data sampling and classification techniques

Abstract: Losses from fraudulent credit card transactions grow every year. Recent work has focused on using machine learning algorithms to identify fraudulent transactions. Fraud cases are very rare relative to non-fraud transactions, which produces skewed (imbalanced) data and makes training machine learning models difficult. Public datasets for this research problem are scarce. The dataset used for this work was obtained from Kaggle. In this p…

Cited by 7 publications (3 citation statements)
References 6 publications

“…Tomek links corresponding instances of the opposite class that are the nearest neighbors to each other to provide better class separation to the decision borders. The results were improved to 99% compared with 94% in RUS [27,30].…”
Section: Class Imbalance Challenge (mentioning)
confidence: 86%
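
For readers unfamiliar with the Tomek links mentioned in the statement above, here is a minimal sketch, assuming the standard definition: a Tomek link is a pair of instances from opposite classes that are each other's nearest neighbor. The function names (`nearest_neighbor`, `tomek_links`, `remove_majority_in_links`) and the toy data are hypothetical illustrations, not code from the cited papers, and a brute-force Euclidean search stands in for whatever neighbor search those papers used.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def remove_majority_in_links(points, labels, majority_label):
    """Undersample by dropping the majority-class member of each Tomek link."""
    drop = {
        k
        for pair in tomek_links(points, labels)
        for k in pair
        if labels[k] == majority_label
    }
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data (assumed): label 1 = fraud (minority), label 0 = non-fraud (majority).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (10.0, 10.0)]
labels = [0, 1, 0, 0, 0]
print(tomek_links(points, labels))
print(remove_majority_in_links(points, labels, majority_label=0))
```

Dropping only the majority-class member of each link is one common choice; removing both members is another, and the quoted statement does not say which variant was used.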
“…The authors noticed that oversampling techniques improved the performance of the models and claimed that there was no preference for one oversampling technique over another, as everything depended on the type of machine learning (ML) algorithm being used. Other researchers [5] explored different undersampling techniques using SMOTE and SMOTE-Tomek for unbalanced data. The classification models used in this study (KNN, LR, RF, and SVM) were trained on balanced data to detect fraudulent credit card transactions.…”
Section: Related Work (mentioning)
confidence: 99%
“…The main data balancing methods include changing data distribution and improving the algorithm's level. The former involves over-sampling algorithms, such as SMOTE [5] and ADASYN [6], combined with under-sampling algorithms [7][8][9][10][11][12], which generate new samples.…”
Section: Introduction (mentioning)
confidence: 99%
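
As a rough illustration of how an over-sampling algorithm such as SMOTE "generates new samples", the sketch below interpolates between a minority-class point and one of its nearest minority-class neighbors. It is a simplified sketch under stated assumptions: the function name `smote_sample`, the parameter `k`, and the toy points are hypothetical, and the reference SMOTE algorithm includes details (per-sample oversampling amounts, for example) that are omitted here.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a minority
    point and one of its k nearest minority-class neighbors."""
    rng = rng or random.Random(0)
    base_idx = rng.randrange(len(minority))
    base = minority[base_idx]
    # Rank the other minority points by distance to the chosen base point.
    others = [p for j, p in enumerate(minority) if j != base_idx]
    neighbors = sorted(others, key=lambda p: dist(base, p))[:k]
    other = rng.choice(neighbors)
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(b + gap * (o - b) for b, o in zip(base, other))


# Toy minority-class (fraud) points, assumed for illustration only.
fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_sample(fraud, k=2, rng=random.Random(42)))
```

Because every synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies.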