Abstract:Machine learning is an established method of predicting customer defection from a contractual business. However, no systematic comparison or evaluation of the different machine-learning techniques has been performed. In this study, we provide a comprehensive comparison of different machine-learning techniques with three different data sets of a software company to predict customer defection. The evaluation criteria of the techniques are understandability of the model, convenience of using the model, time effic… Show more
“…Al-Molhem et al [68] conducted social network analysis to enhance the results of churn prediction models in the telecom domain using call detail records to construct a weighed graph representing the distance between two subscribers to calculate the centrality. The usage of marketing related variables, such as promotions offered to a customer, calls developed in a retention strategy, and helpdesk interactions, were applied by Verbeke et al [50] However, certain studies did not identify the features employed [9,12,18,22,25,42,48,67,70,73,85,87,88,97,98,99,107,110,112,113,116,119,122,127], which in some cases are related to the usage of more than one database. Idris et al [119] used two databases (orange and cell2cell).…”
Section: Rq3 -What Are the Features Used To Predict Dropout?mentioning
confidence: 99%
“…In contractual setting scenarios, such as insurance, telecommunications, and magazine subscriptions, firms can accurately understand the cash flow generated by their customers, as customers usually sign long-term contracts with firms [21]. The customer must choose whether to opt-in or to opt-out [22], i.e., (1) customers will choose to opt-in if they want to enter into a contract with a particular form (e.g., renewal form) or (2) customers will choose to opt-out if they prefer not to renew.…”
Section: Introductionmentioning
confidence: 99%
“…Articles by adopted algorithms ,12,22,25,40,43,45,47,50,71,75,84,90,100,105,106,118,122,123,125] ,18,22,24,28,32,33,42,46,49,50,72,75,76,77,78,79,86,88,89,90,95,97,99,100,106,112,119,126,127] …”
Dropout prediction is a problem that is being addressed with machine learning algorithms; thus, appropriate approaches to address the dropout rate are needed. The selection of an algorithm to predict the dropout rate is only one problem to be addressed. Other aspects should also be considered, such as which features should be selected and how to measure accuracy while considering whether the features are appropriate according to the business context in which they are employed. To solve these questions, the goal of this paper is to develop a systematic literature review to evaluate the development of existing studies and to predict the dropout rate in contractual settings using machine learning to identify current trends and research opportunities. The results of this study identify trends in the use of machine learning algorithms in different business areas and in the adoption of machine learning algorithms, including which metrics are being adopted and what features are being applied. Finally, some research opportunities and gaps that could be explored in future research are presented.
“…Al-Molhem et al [68] conducted social network analysis to enhance the results of churn prediction models in the telecom domain using call detail records to construct a weighed graph representing the distance between two subscribers to calculate the centrality. The usage of marketing related variables, such as promotions offered to a customer, calls developed in a retention strategy, and helpdesk interactions, were applied by Verbeke et al [50] However, certain studies did not identify the features employed [9,12,18,22,25,42,48,67,70,73,85,87,88,97,98,99,107,110,112,113,116,119,122,127], which in some cases are related to the usage of more than one database. Idris et al [119] used two databases (orange and cell2cell).…”
Section: Rq3 -What Are the Features Used To Predict Dropout?mentioning
confidence: 99%
“…In contractual setting scenarios, such as insurance, telecommunications, and magazine subscriptions, firms can accurately understand the cash flow generated by their customers, as customers usually sign long-term contracts with firms [21]. The customer must choose whether to opt-in or to opt-out [22], i.e., (1) customers will choose to opt-in if they want to enter into a contract with a particular form (e.g., renewal form) or (2) customers will choose to opt-out if they prefer not to renew.…”
Section: Introductionmentioning
confidence: 99%
“…Articles by adopted algorithms ,12,22,25,40,43,45,47,50,71,75,84,90,100,105,106,118,122,123,125] ,18,22,24,28,32,33,42,46,49,50,72,75,76,77,78,79,86,88,89,90,95,97,99,100,106,112,119,126,127] …”
Dropout prediction is a problem that is being addressed with machine learning algorithms; thus, appropriate approaches to address the dropout rate are needed. The selection of an algorithm to predict the dropout rate is only one problem to be addressed. Other aspects should also be considered, such as which features should be selected and how to measure accuracy while considering whether the features are appropriate according to the business context in which they are employed. To solve these questions, the goal of this paper is to develop a systematic literature review to evaluate the development of existing studies and to predict the dropout rate in contractual settings using machine learning to identify current trends and research opportunities. The results of this study identify trends in the use of machine learning algorithms in different business areas and in the adoption of machine learning algorithms, including which metrics are being adopted and what features are being applied. Finally, some research opportunities and gaps that could be explored in future research are presented.
“…Machine learning approaches are now used in various areas and applications, such as image and speech recognition, natural language processing, as part of internet offers and sales (recommender systems), within banking and financial services (e.g., the detection of unusual financial transactions), within accounting and systems to uncover tax fraud, within medical and pharmaceutical processes, within transport and logistics (e.g., autonomous vehicles and the automation of logistics processes), the optimization of energy infrastructure, the optimalization of management, the optimization of various areas of business management (e.g., predictions of financial health, the optimization of supply processes, storage, the optimization of targeting of marketing tools, and the optimization of investment decisions), etc. [12,13].…”
Predictions of the unemployment duration of the economically active population play a crucial assisting role for policymakers and employment agencies in the well-organised allocation of resources (tied to solving problems of the unemployed, whether on the labour supply or demand side) and providing targeted support to jobseekers in their job search. This study aimed to develop an ensemble model that can serve as a reliable tool for predicting unemployment duration among jobseekers in Slovakia. The ensemble model was developed using real data from the database of jobseekers (those registered as unemployed and actively searching for a job through the Local Labour Office, Social Affairs, and Family) using the stacking method, incorporating predictions from three individual models: CART, CHAID, and discriminant analysis. The final meta-model was created using logistic regression and indicates an overall accuracy of the prediction of unemployment duration of almost 78%. This model demonstrated high accuracy and precision in identifying jobseekers at risk of long-term unemployment exceeding 12 months. The presented model, working with real data of a robust nature, represents an operational tool that can be used to check the functionality of the current labour market policy and to solve the problem of long-term unemployed individuals in Slovakia, as well as in the creation of future government measures aimed at solving the problem of unemployment. The measures from the state are financed from budget funds, and by applying the appropriate model, it is possible to arrive at the rationalization of the financing of these measures, or to specifically determine the means intended to solve the problem of long-term unemployment in Slovakia (this, together with the regional disproportion of unemployment, is considered one of the most prominent problems in the labour market in Slovakia). The model also has the potential to be adapted in other economies, taking into account country-specific conditions and variables, which is possible due to the data-mining approach used.
“…The case where dropout is developed has two main scenarios [2,3]: (1) Contractual settings, where customers pay a monthly fee and the customer informs the end of the relationship; and (2) non-contractual settings, where the organization has to extrapolate whether the customer is still active or not. In the contractual setting, the customer must choose whether they will dropout or not; for example, if they renew a contract or not [4]. This means that, in contractual settings, the customer dropout represents an explicit ending of a relationship that is more penalizing than that in non-contractual settings [5], which has implications for the profitability of organizations, increasing marketing costs and reducing sales [6].…”
Dropout prediction is a problem that must be addressed in various organizations, as retaining customers is generally more profitable than attracting them. Existing approaches address the problem considering a dependent variable representing dropout or non-dropout, without considering the dynamic perspetive that the dropout risk changes over time. To solve this problem, we explore the use of random survival forests combined with clusters, in order to evaluate whether the prediction performance improves. The model performance was determined using the concordance probability, Brier Score and the error in the prediction considering 5200 customers of a Health Club. Our results show that the prediction performance in the survival models increased substantially in the models using clusters rather than that without clusters, with a statistically significant difference between the models. The model using a hybrid approach improved the accuracy of the survival model, providing support to develop countermeasures considering the period in which dropout is likely to occur.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.