Customer Churn Analysis and Prediction Using Data Mining Models in Banking Industry

Karvana, Ketut Gde Manik; Yazid, Setiadi; Syalim, Amril; Mursanto, Petrus

doi:10.1109/iwbis.2019.8935884

Cited by 37 publications

(14 citation statements)

References 6 publications


“…Datamining provides classification techniques that can be applied for churn prediction problem. A study on applying five different classification techniques on a private Indonesia banking dataset was performed and proved that SVM performs well in prediction of churners [7]. Though commercial banks take measures to analyse the customer information in their records, it is found that prevention of churning is challenging. An improved Fuzzy C means clustering algorithm has been proposed to understand the customer behaviour which in turn helps in identification of churners [8].…”

Section: Related Work

mentioning

confidence: 99%

An Optimal Classifier discovery for diagnosing the account health in financial firms and a study of classifier performance on imbalanced data

Suguna, Subhashini, Lakshmanan et al. 2022

Preprint


Customers are the backbone for any financial companies. The behaviour of customer changes over time and they disconnect when the services do not meet their expectations. Earning loyalty of the customer by providing remarkable services and adopting retention strategies are mandatory to run any user centric businesses. In view of the growth perception there is need for company to identify the churn and avoid them in time. The mechanism for churn prediction requires to explore the insight of data. Machine learning algorithms are capable of mining the patterns present in the data and able to discriminate between classes with statistical learning. A Standard bank dataset has been considered for the study and exploratory data analysis performed to understand the nature of the data. Suitable data pre-processing is done and training data split from dataset has been used to build classifier models. The dataset was found to be imbalanced and by adopting appropriate sampling the dataset was balanced. Linear, Non-linear and boosting classifiers were built and their performances on test data are summarized. A comparative study on the classifier performance for both imbalanced as well as balanced dataset was observed and an optimal classifier for diagnosing customer account health has been suggested.



“…Churn customer forecasting is an activity performed to predict whether a customer will leave the company. In addition, this was inspired by the fact that there are about 1.5 million bank churn customers annually, which is increasing each year [15]. Indeed, if it is difficult to win a customer, it is very easy on the other hand to lose him.…”

Section: Introduction

mentioning

confidence: 99%

“…The most commonly used metrics include accuracy first, then specificity, sensitivity, precision, and recall which are often compromised by the f1 score, and finally, the AUC (area under the curve). Only three studies have balanced the data [15,30,34] and four others explained their models by the importance of features [29,31,35,38] but none used Shapley values or an approach involving both data balancing and model explanation.…”

mentioning

confidence: 99%
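The metrics named in the statement above can be made concrete with a small sketch. The function names and the example counts below are hypothetical and are not taken from any of the cited studies. Note that sensitivity and recall are the same quantity, and that the F1 score is the harmonic mean of precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts, with "churner" treated as the positive class.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity (also called recall): share of real churners that were caught.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of real non-churners correctly left alone.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision: share of predicted churners that really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples and is meant for illustration only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical, imbalanced example: 80 churners among 1000 customers.
metrics = classification_metrics(tp=60, fp=20, tn=900, fn=20)
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

With these made-up counts, accuracy is high even though a quarter of the churners are missed, which is one reason accuracy alone can mislead on imbalanced churn data.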

Towards Explainable Machine Learning for Bank Churn Prediction Using Data Balancing and Ensemble-Based Methods

et al. 2022


The diversity of data collected on both social networks and digital interfaces is extremely increased, raising the problem of heterogeneous variables that are not often favourable to classification algorithms. Despite the significant improvement in machine learning (ML) and predictive analysis efficiency for classification in customer relationship management systems (CRM), their performance remains very limited by heterogeneous data processing, class imbalance, and feature scales. This impact turned out to be more important for simple ML methods which in addition often suffer from over-fitting. This paper proposes a succinct and detailed ML model building process including cross-validation of the combination of SMOTE to balance data and ensemble methods for modelling. From the conducted experiments, the random forest (RF) model yielded the best performance of 0.86 in terms of accuracy and f1-score using balanced data. It confirms the literature summary about this topic which shows that RF was among the most effective algorithms for customer predictive classification issues. The constructed and optimized models were interpreted by Shapley values and feature importance analysis which shows that the “age” feature was the most significant while “HasCrCard” was the less one. This process has proven effective in bridging previously reported research gaps and the resulting model should be used for supporting bank customer loyalty decision-making.
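The abstract above names SMOTE as its balancing step. The following is a minimal pure-Python sketch of the core SMOTE idea (synthetic minority samples created by interpolating between a minority point and one of its nearest minority neighbours). It is not the cited authors' code; the function name, parameters, and toy data are assumptions made for illustration. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    minority: list of equal-length numeric tuples (minority-class rows).
    k: number of nearest minority neighbours to choose from (hypothetical default).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a base minority sample by index so duplicates are handled correctly.
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # Sort the remaining minority samples by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class (made-up values): four churners described by two features.
churners = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
new_rows = smote_like_oversample(churners, n_new=5)
```

Because every synthetic row lies on a segment between two existing minority rows, it stays inside the region the minority class already occupies. When balancing is combined with cross-validation, as the abstract describes, the oversampling is usually fitted on the training folds only so that synthetic points do not leak into the evaluation data.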


“…Given that customers are the most valuable assets that have a direct impact on the revenue of the banking industry, customer churn is a source of major concern for service organizations [2]. It is therefore an important basic requirement that banks have good knowledge of customers' data, find factors that increase customer churn and take the necessary actions to reduce it [2,3]. The advancement of technology in the last few decades has made it possible for banks and many other service organizations to collect and store data about their customers and classify them into either the churner or non-churner categories.…”

Section: Introduction

mentioning

confidence: 99%

Experimental Analysis of Hyperparameters for Deep Learning-Based Churn Prediction in the Banking Sector

2021


Until recently, traditional machine learning techniques (TMLTs) such as multilayer perceptrons (MLPs) and support vector machines (SVMs) have been used successfully for churn prediction, but with significant efforts expended on the configuration of the training parameters. The selection of the right training parameters for supervised learning is almost always experimentally determined in an ad hoc manner. Deep neural networks (DNNs) have shown significant predictive strength over TMLTs when used for churn predictions. However, the more complex architecture of DNNs and their capacity to process huge amounts of non-linear input data demand more time and effort to configure the training hyperparameters for DNNs during churn modeling. This makes the process more challenging for inexperienced machine learning practitioners and researchers. So far, limited research has been done to establish the effects of different hyperparameters on the performance of DNNs during churn prediction. There is a lack of empirically derived heuristic knowledge to guide the selection of hyperparameters when DNNs are used for churn modeling. This paper presents an experimental analysis of the effects of different hyperparameters when DNNs are used for churn prediction in the banking sector. The results from three experiments revealed that the deep neural network (DNN) model performed better than the MLP when a rectifier function was used for activation in the hidden layers and a sigmoid function was used in the output layer. The performance of the DNN was better when the batch size was smaller than the size of the test set data, while the RMSProp training algorithm had better accuracy when compared with the stochastic gradient descent (SGD), Adam, AdaGrad, Adadelta, and AdaMax algorithms. The study provides heuristic knowledge that could guide researchers and practitioners in machine learning-based churn prediction from the tabular data for customer relationship management in the banking sector when DNNs are used.
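The activation setup reported in the abstract above (a rectifier in the hidden layers, a sigmoid in the output layer) can be illustrated with a small forward pass. This is only a sketch under assumptions: the layer sizes, weights, and function names are invented for illustration and do not reproduce the cited paper's network or its training procedure.

```python
import math


def relu(x):
    """Rectifier activation: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Numerically stable logistic function mapping a real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def forward(features, hidden_layers, output_weights, output_bias):
    """Forward pass: ReLU in every hidden layer, sigmoid on the single output.

    hidden_layers is a list of (weights, biases) pairs, where weights holds
    one row of input weights per neuron in that layer.
    """
    activations = list(features)
    for weights, biases in hidden_layers:
        activations = [
            relu(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    logit = sum(w * a for w, a in zip(output_weights, activations)) + output_bias
    # The sigmoid output can be read as an estimated churn probability.
    return sigmoid(logit)


# Made-up two-feature customer and a tiny network with two hidden layers.
layers = [
    ([[0.5, -0.25], [-1.0, 0.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
churn_probability = forward([1.0, 2.0], layers, output_weights=[2.0], output_bias=0.0)
```

For this particular input both first-layer neurons are switched off by the rectifier, so the output logit is zero and the sigmoid returns 0.5.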
