Purpose
The authors develop a framework for building an early warning mechanism that detects the financial deterioration of Chinese companies. Studies in the financial distress and bankruptcy prediction literature rarely examine the impact of pre-processing financial indicators on prediction performance. The purpose of this paper is to address this shortcoming.
Design/methodology/approach
The proposed framework is evaluated on both original and discretized data, with a least absolute shrinkage and selection operator (LASSO) technique used to select an appropriate subset of financial ratios for improved predictive performance. The financial ratios are then analyzed with five different data mining techniques. Applying the methodology to data from Chinese companies yields managerial insights.
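As an illustration of the discretization step described above, the following is a minimal sketch only. The abstract does not state which discretization scheme is used, so equal-frequency (quantile) binning is assumed here. The function names and the current-ratio values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_cut_points(values, n_bins=4):
    """Return n_bins - 1 cut points splitting values into equal-frequency bins.

    Assumption: quantile binning; the paper's actual scheme is not specified.
    """
    return quantiles(values, n=n_bins, method="inclusive")


def discretize(values, cut_points):
    """Map each continuous value to an ordinal bin index in 0..len(cut_points)."""
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical current-ratio values for eight firms (made-up numbers).
current_ratio = [0.4, 0.9, 1.1, 1.3, 1.6, 2.0, 2.7, 5.5]

cuts = fit_cut_points(current_ratio, n_bins=4)
bins = discretize(current_ratio, cuts)
print(bins)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

In this sketch the extreme value 5.5 falls into the same top bin as 2.7, which shows one way discretization can dampen the influence of outlying ratios before a classifier is trained. In practice the cut points would be fitted on training data only and then reused for unseen firms.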
Findings
Prediction accuracy increases after the continuous financial ratio variables are discretized. Better prediction performance can be achieved by including fewer but relatively more significant variables. Random forest has the highest overall performance, followed closely by support vector machine (SVM) and neural network.
Originality/value
The contribution of this study is fourfold. First, the authors add to the literature on defaults by showing variable discretization to be an essential pre-processing step for improving prediction performance in classification problems. Second, the authors demonstrate that machine learning approaches can achieve better performance than traditional statistical methods in classification tasks. Third, the authors provide evidence for adopting C5.0 over other methods, because the rules generated with C5.0 offer managerial insights. Finally, the authors demonstrate the effectiveness of the LASSO technique for identifying the most important financial ratios from each category, enabling one to build better predictive models.