Credit card fraud is a growing problem, and it escalated during COVID-19 as authorities in many countries required people to use cashless transactions. Every year, billions of euros are lost to fraudulent credit card transactions, so fraud detection systems are essential for financial institutions. Because the classes are not equally represented in the credit card dataset, machine learning algorithms fit the model mainly to the majority class, which leads to inaccurate fraud predictions. In this research, we therefore focus on processing unbalanced data with an under-sampling technique to obtain more accurate results with different machine learning algorithms. We propose a framework that clusters the dataset using fuzzy C-means and selects similar fraud and normal instances that share the same features, which guarantees the integrity of the data features.
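The clustering and selection steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how "similar" instances are chosen, so the sketch assumes each fraud instance is paired with its nearest unused normal instance in the same cluster. The function names `fuzzy_c_means` and `undersample`, the fuzzifier `m=2.0`, and the iteration count are all hypothetical choices.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy C-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership**m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total for k in range(d)]
            )
        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12) for j in range(c)]
            inv = [dj ** (-2.0 / (m - 1.0)) for dj in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


def undersample(points, labels, c=2, seed=0):
    """Keep every fraud row (label 1) and, as an ASSUMED selection rule,
    its nearest not-yet-used normal row (label 0) from the same cluster."""
    _, u = fuzzy_c_means(points, c, seed=seed)
    # Harden the fuzzy memberships: each row goes to its strongest cluster.
    cluster = [max(range(c), key=lambda j: row[j]) for row in u]
    keep, used = [], set()
    for i, y in enumerate(labels):
        if y != 1:
            continue
        keep.append(i)
        candidates = [
            k for k, yk in enumerate(labels)
            if yk == 0 and cluster[k] == cluster[i] and k not in used
        ]
        if candidates:
            best = min(candidates, key=lambda k: math.dist(points[i], points[k]))
            used.add(best)
            keep.append(best)
    return sorted(keep)
```

Calling `undersample(points, labels)` returns the row indices of a reduced dataset with at most one normal instance per fraud instance; a classifier would then be trained on those rows only.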
Recent developments in e-payment systems have led to increased financial fraud, such as credit card fraud. It is therefore essential to implement detection mechanisms for credit card fraud. However, because the class distribution in the credit card dataset is unbalanced, machine learning techniques train the model mainly on the majority class, resulting in inaccurate fraud predictions. This paper therefore focuses on processing unbalanced data using an oversampling technique called Elbow Fuzzy Noise filtering SMOTE (EFN-SMOTE). The method divides the dataset into multiple clusters, with the number of clusters determined by the Elbow method. Noise filtering is then applied to each cluster. After that, SMOTE is used in each cluster to synthesize new minority instances based on the nearest majority instance of each minority instance, so that the decision boundary is captured effectively. The result is a dataset balanced by oversampling. The results show that EFN-SMOTE achieved better classification performance using an Artificial Neural Network (ANN) with four hidden layers, with 0.999 accuracy, 0.998 precision, 0.999 sensitivity, 0.998 specificity, 0.999 F-measure, and 0.999 G-Mean.
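Two of the steps named above, choosing the number of clusters with the Elbow method and synthesizing minority instances from the nearest majority instance, can be illustrated with a small sketch. It is not the EFN-SMOTE implementation: clustering and noise filtering are omitted, the elbow is located with a common largest-second-difference heuristic, and the interpolation rule (a random step of at most `max_gap` from the minority point toward its nearest majority point) is an assumption, since the abstract does not give the formula. The names `elbow_k`, `synthesize`, and `max_gap` are hypothetical.

```python
import math
import random


def elbow_k(inertias):
    """Pick k at the sharpest bend of the inertia curve.

    inertias[i] is the within-cluster sum of squares for k = i + 1.
    The bend is measured by the discrete second difference (a heuristic).
    """
    best_k, best_bend = 1, float("-inf")
    for i in range(1, len(inertias) - 1):
        bend = inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


def synthesize(minority, majority, n_new, max_gap=0.5, seed=0):
    """Create n_new synthetic minority points inside one cluster.

    ASSUMED rule: each new point lies on the segment from a minority
    point toward its nearest majority point, at most max_gap of the way.
    """
    rng = random.Random(seed)
    out = []
    for t in range(n_new):
        x = minority[t % len(minority)]  # cycle through the minority points
        nearest = min(majority, key=lambda z: math.dist(x, z))
        gap = rng.uniform(0.0, max_gap)
        out.append([a + gap * (b - a) for a, b in zip(x, nearest)])
    return out
```

In a full pipeline, `synthesize` would be called once per cluster after noise filtering, with `n_new` chosen so that the minority and majority counts match.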