Bankruptcy prediction is one of the most important research topics in
the area of accounting and finance. The rapid growth of data science,
artificial intelligence, and machine learning has led researchers to
develop increasingly accurate bankruptcy prediction models. Recent studies show
that ensemble methods achieve better performance than traditional
machine learning models for predicting corporate failure, especially
with highly imbalanced datasets. However, the black-box nature of
these techniques makes their results difficult to interpret: they assign
companies to classes without any explanation. To address this, we propose
an accurate and interpretable classification model that outputs
a set of prediction rules. Specifically, we introduce a semi-supervised
Tri-eXtreme Gradient Boosting (Tri-XGBoost) approach. In the
proposed approach, three XGBoost variants, using the gbtree, gblinear,
and dart boosters, serve as the weak classifiers. They are
combined with sampling methods such as Borderline-SMOTE (BLSmote) and
random under-sampling (RUS) to balance the class distribution of the datasets.
In addition, XGBoost is used to select the most important
features, which improves predictive accuracy. Finally, the results are
presented in the form of “IF-THEN” rules to make the model more
comprehensible to both applicants and experts. Our
proposed model is validated on the imbalanced Polish bankruptcy
datasets. The experimental results show that the proposed method
compares favorably with existing methods, with AUC, G-mean, and
F1-score values ranging from 91% to 97%.
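
To illustrate why measures such as G-mean and F1-score are reported for imbalanced data, the following minimal sketch computes both from binary predictions using their standard definitions. The function names and the toy labels are hypothetical and are not taken from the paper; the sketch only assumes that the bankrupt class is the minority (positive) class.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 1 = bankrupt (minority), 0 = healthy (majority).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_healthy = [0] * 10                      # never predicts bankruptcy
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]      # one hit, one miss, one false alarm

for name, y_pred in [("all_healthy", all_healthy), ("mixed", mixed)]:
    print(name, accuracy(y_true, y_pred), g_mean(y_true, y_pred),
          f1_score(y_true, y_pred))
```

In this toy case both predictors reach the same accuracy, yet the one that never predicts bankruptcy scores zero on G-mean and F1-score, which is why such measures are preferred over plain accuracy on imbalanced datasets.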