Imbalanced classification in bankruptcy prediction is considered one of the most important topics for financial institutions. In this context, various statistical and artificial intelligence methods have been proposed. Recently, deep learning algorithms have experienced a resurgence of interest and are widely used to build prediction and classification models. To this end, we propose a novel deep learning-based approach called BSM-SAES. This approach combines the Borderline Synthetic Minority Oversampling Technique (BSM) with a Stacked AutoEncoder (SAE) based on the Softmax classifier. The aim is to develop an accurate and reliable bankruptcy prediction model that includes the feature extraction process. To assess the classification performance of the proposed model, we also apply several machine learning methods: k-nearest neighbor, decision tree, C5.0, support vector machine, and artificial neural network. We evaluate the proposed approach on the imbalanced Polish bankruptcy datasets. The results confirm the efficiency of the proposed model compared with the other machine learning models in predicting and classifying the financial status of a firm.
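
The borderline oversampling step named in the abstract above can be illustrated with a minimal sketch. It follows the general Borderline-SMOTE idea (oversample only minority samples whose neighborhood is dominated, but not fully occupied, by the majority class) and is not the authors' implementation. The function names `k_nearest` and `borderline_smote`, the parameters `k`, `n_new`, and `seed`, and the representation of samples as lists of floats are all assumptions made for illustration.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return indices of the k candidates closest to point (Euclidean)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def borderline_smote(X, y, minority=1, k=5, n_new=None, seed=0):
    """Generate synthetic minority samples near the class border.

    Hypothetical helper: X is a list of feature lists, y a list of labels.
    Returns only the new synthetic samples, not the augmented dataset.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    if len(min_idx) < 2:
        return []  # nothing to interpolate between

    # Step 1: a minority sample is in "danger" when at least half,
    # but not all, of its k nearest neighbours are majority samples.
    danger = []
    for i in min_idx:
        others = [j for j in range(len(X)) if j != i]
        nn = k_nearest(X[i], [X[j] for j in others], k)
        n_major = sum(1 for t in nn if y[others[t]] != minority)
        if k / 2 <= n_major < k:
            danger.append(i)
    if not danger:
        return []

    # Step 2: interpolate between a danger sample and one of its
    # nearest minority neighbours, at a random position on the segment.
    if n_new is None:
        n_new = len(danger)
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        mates = [j for j in min_idx if j != i]
        nn = k_nearest(X[i], [X[j] for j in mates], min(k, len(mates)))
        j = mates[rng.choice(nn)]
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Minority samples whose neighbors are all majority samples are treated as noise and skipped, and samples deep inside the minority region are left alone, so the new points concentrate where the two classes meet. In practice a maintained library implementation would be preferable to this sketch.
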
Bankruptcy prediction is one of the most important research topics in
the area of accounting and finance. The rapid growth of data science,
artificial intelligence, and machine learning has led researchers to
develop accurate bankruptcy prediction models. Recent studies show
that ensemble methods achieve better performance than traditional
machine learning models for predicting corporate failure, especially
with highly imbalanced datasets. However, the black-box nature of these
techniques makes their results difficult to interpret: they assign
corporate classes without any explanation. To this end, we propose to
build an accurate and interpretable classification model that generates
a set of prediction rules as output. In this paper, a semi-supervised
Tri-eXtreme Gradient Boosting (Tri-XGBoost) approach is proposed. In
this approach, three XGBoost boosters (gbtree, gblinear, and dart) are
applied as the weak classifiers, combined with sampling methods such as
Borderline-SMOTE (BLSmote) and random under-sampling (RUS) to balance
the class distribution of the datasets. In addition, XGBoost is used to
select the most important features, which increases the predictive
accuracy. Finally, our result is presented in the form of “IF-THEN”
rules to enhance the comprehensibility of the model for both applicants
and experts. Our
proposed model is validated using the Polish bankruptcy imbalanced
datasets. The experimental results confirm that the proposed method
performs well compared with existing methods, with AUC, G-mean, and
F1-score values ranging from 91% to 97%.
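
Two of the metrics reported above, G-mean and F1-score, are commonly used for imbalanced data because plain accuracy can look high even when every bankrupt firm is missed. The sketch below computes both from confusion-matrix counts. It assumes the usual definitions (G-mean as the geometric mean of sensitivity and specificity, F1 as the harmonic mean of precision and recall); the abstract does not state the exact formulas used, and the function names and the example counts are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, tn, fn), treating `positive` as the bankrupt class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall, written as 2TP/(2TP+FP+FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: 5 bankrupt firms among 100, all predicted healthy.
# Accuracy would be 95%, yet both metrics below come out as zero.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
counts = confusion_counts(y_true, y_pred)
print(g_mean(*counts), f1_score(*counts))
```

Because G-mean multiplies the per-class recall values, it collapses to zero whenever one class is ignored entirely, which is the failure mode that resampling methods such as Borderline-SMOTE and RUS are meant to prevent.
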