Although microfinance organizations play an important role in developing economies, decision support models for microfinance credit scoring have not been sufficiently covered in the literature, particularly for microcredit enterprises. The aim of this paper is to create a three-class model that can improve credit risk assessment in the microfinance context. The real-world microcredit data set used in this study includes data from retail, micro, and small enterprises. To the best of the authors' knowledge, existing research on microfinance credit scoring has been limited to regression and genetic algorithms, thereby excluding novel machine learning algorithms. This research aims to close that gap. The proposed models predict default events by analysing different ensemble classification methods that amplify the effects of the synthetic minority oversampling technique (SMOTE), which is used to preprocess the imbalanced microcredit data set. Initial results showed improved predictions for certain classes when oversampling was combined with homogeneous and heterogeneous ensemble classifiers. Predictions improved for all classes when SMOTE and the Consolidated Trees Construction algorithm were applied together with Rotation Forest. To give a complete view of performance, an additional set of metrics is used in the evaluation.
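To illustrate the oversampling step mentioned above, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration. The sketch is not the authors' pipeline and does not cover the Consolidated Trees Construction or Rotation Forest stages.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic rows from minority-class rows X_min.

    Hypothetical helper: a bare-bones version of SMOTE interpolation,
    not a replacement for a tested library implementation.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise Euclidean distances between minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for each new row.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between base and neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy minority class: four points on the unit square (made-up data).
X_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(X_minority, n_new=6, k=2)
```

Because every synthetic row is a convex combination of two existing minority rows, the new points stay inside the region already occupied by the minority class. In a real workflow the oversampling would be applied to the training split only, before fitting the ensemble classifier.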
Summary
Credit scoring is one of the most important parts of credit risk management for reducing the risk of client defaults and bankruptcies. Deep learning has received much attention in recent years, but it has been applied less intensively in credit scoring than in other financial domains. In this article, stacked unidirectional and bidirectional LSTM (long short-term memory) networks, a complex area of deep learning, are applied to credit scoring problems for the first time. The proposed robust model exploits the full potential of the three-layer stacked LSTM and BDLSTM (bidirectional LSTM) architecture and treats and models public datasets in a novel way, since credit scoring is not a time sequence problem. The attributes of each loan instance were transformed into a matrix sequence using a fixed sliding window that advances one time step at a time. Our proposed models outperform existing and much more complex deep learning solutions, so we succeeded in preserving simplicity. Several types of measures are employed to draw consistent conclusions. With three hidden layers, accuracy reached 87.19% on the German Credit dataset, 93.69% on the Kaggle dataset, and 97.80% on the Microcredit dataset.
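The sliding-window transformation described above can be sketched as follows. This is one plausible reading of the abstract, not the authors' code: the helper name `to_windows`, the window length of 3, and the toy attribute values are all assumptions. Each loan's flat attribute vector becomes a matrix whose rows are overlapping windows, and the window moves forward by one attribute per row so that a recurrent network can consume the rows as time steps.

```python
import numpy as np


def to_windows(attributes, window):
    """Turn one loan's attribute vector into overlapping windows (stride 1).

    Hypothetical helper. Returns an array of shape
    (n_attributes - window + 1, window).
    """
    attributes = np.asarray(attributes, dtype=float)
    n = attributes.size
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and the number of attributes")
    # Row i holds attributes i .. i + window - 1.
    return np.stack([attributes[i:i + window] for i in range(n - window + 1)])


# Toy loan with five already-scaled attributes (made-up values).
loan = [0.2, 0.5, 0.1, 0.9, 0.4]
sequence = to_windows(loan, window=3)

# A batch for a recurrent layer has shape (n_loans, time_steps, window).
batch = np.stack([to_windows(row, window=3) for row in [loan, loan[::-1]]])
```

With five attributes and a window of 3, each loan yields three time steps of three values each, so a batch of two loans has shape (2, 3, 3). The window length is a tunable choice here; the abstract does not state the value used in the paper.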