Customer retention is a challenging and critical issue in telecommunication and other service-based sectors. Various researchers have established that, for a service-based company, retaining existing customers is much cheaper than acquiring new ones. Predictive models of customer behavior are among the most useful instruments in the retention process, because they allow a company to infer how its customers are likely to behave in the future. Selecting the best model is itself a herculean task, because the performance of predictive models degrades considerably when the real-world dataset is highly imbalanced. This study analyses the performance of five homogeneous ensembles: bagging, boosting, rotation forest, cascade, and dagging. The ensembles were applied to both the raw and the balanced dataset so that their performance could be compared. A data sampling method (oversampling) was adopted to balance the raw dataset. The primary metrics used to evaluate the models were accuracy and ROC/AUC (Receiver Operating Characteristics/Area Under Curve). The Weka 3.8.5 machine learning tool was used to develop and analyze the models. The study reveals that bagging had the best performance, with an AUC of 0.987, followed by boosting and rotation forest, both with an AUC of 0.985.
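For readers unfamiliar with the two techniques named above, the following is a minimal Python sketch of random oversampling and of AUC computed as a pairwise ranking statistic. It is an illustration only, not the Weka procedure used in the study; the function names `random_oversample` and `roc_auc` and the toy data are assumptions introduced here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def roc_auc(scores, labels, positive=1):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up data: four non-churners (0) and one churner (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)

# Hypothetical classifier scores for four customers.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In this toy case the balanced set contains four rows of each class, and the AUC is 0.75 because three of the four positive-negative pairs are ranked correctly. Unlike accuracy, AUC depends only on how the scores rank the examples, which is one reason it is commonly reported for imbalanced problems such as churn prediction.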