Credit scoring is central to an institution's lending business. As artificial intelligence spreads across fields, credit rating is likewise adapting to technological change. Combining credit evaluation with machine learning allows a relatively comprehensive set of features to be incorporated into the evaluation process. This article uses CatBoost to maintain accuracy while making the model as explainable as possible, which avoids the trust problem traditionally associated with black-box models. Adding explainability to the machine learning model reduces the difficulty of processing large amounts of data and lowers the threshold for non-professionals to understand the model. The dataset is LendingClub personal loan data obtained with Python. Analyzing these data with CatBoost yields excellent results for applying explainable machine learning to personal credit evaluation.
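The abstract does not say which explanation method is used. One widely used option for tree models is Shapley-value attribution, and the sketch below assumes it. It computes exact Shapley values for a single applicant by enumerating feature subsets, which is only practical for a handful of features. The scoring function, feature names, and numbers are hypothetical stand-ins for a trained model, not LendingClub data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical risk score; features: [debt_to_income, income_thousands, delinquencies]
def toy_risk_score(features):
    dti, income, delinq = features
    return 0.02 * dti - 0.004 * income + 0.15 * delinq + 0.001 * dti * delinq


applicant = [30.0, 50.0, 2.0]   # assumed applicant
baseline = [15.0, 70.0, 0.0]    # assumed "typical" reference applicant
phi = shapley_values(toy_risk_score, applicant, baseline)
for name, contribution in zip(["dti", "income", "delinquencies"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the applicant's score and the baseline score. This additivity is what lets a non-specialist read each number as "how much this feature moved the decision".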
Categorical Boosting (CatBoost) is a new approach to credit rating. When CatBoost is used for classification and prediction, parameter tuning and feature selection are two crucial steps that significantly affect its classification accuracy. This paper proposes a novel SSA-CatBoost model, which combines the Sparrow Search Algorithm (SSA) with CatBoost to improve classification and prediction accuracy for credit rating. For parameter tuning, SSA-CatBoost searches for the optimal parameters by iteratively updating the sparrows' positions and then uses those parameters to improve classification and prediction accuracy. For feature selection, a wrapper method, the Recursive Feature Elimination algorithm, is adopted to reduce the adverse impact of noisy data on the results and to improve computational efficiency. To evaluate the proposed SSA-CatBoost model, P2P lending datasets are used to assess the prediction results, and the SHAP interpretability package is then used to explain why the model classifies a sample as good or bad. Comparisons with the plain CatBoost model and other well-known machine learning models show that SSA-CatBoost achieves high classification and prediction accuracy for credit rating.
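The parameter search described above can be sketched as follows. This is a simplified Sparrow Search Algorithm written for illustration, not the paper's implementation. It keeps the three roles (producers, scroungers, scouts) but applies the scrounger step per dimension and uses an absolute fitness gap in the scout step. The objective `toy_validation_error` is a hypothetical stand-in for a cross-validated CatBoost error over `(learning_rate, depth)`. The bounds and population settings are assumptions as well.

```python
import math
import random


def sparrow_search(objective, bounds, n_sparrows=20, n_iter=50,
                   producer_frac=0.2, scout_frac=0.1,
                   safety_threshold=0.8, seed=0):
    """Minimise objective(position) inside box bounds with a simplified SSA."""
    rng = random.Random(seed)
    n = n_sparrows

    def clip(pos):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(pos, bounds)]

    flock = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    fitness = [objective(p) for p in flock]
    start = min(range(n), key=fitness.__getitem__)
    best_pos, best_fit = flock[start][:], fitness[start]
    n_prod = max(1, int(producer_frac * n))
    n_scout = max(1, int(scout_frac * n))

    for _ in range(n_iter):
        # Rank sparrows so the fittest come first and act as producers.
        order = sorted(range(n), key=fitness.__getitem__)
        flock = [flock[i] for i in order]
        worst_pos = flock[-1][:]

        # Producers: contract while safe, take a random step when alarmed.
        alarm = rng.random()
        for i in range(n_prod):
            if alarm < safety_threshold:
                alpha = rng.uniform(1e-3, 1.0)
                shrink = math.exp(-(i + 1) / (alpha * n_iter))
                flock[i] = clip([v * shrink for v in flock[i]])
            else:
                step = rng.gauss(0.0, 1.0)
                flock[i] = clip([v + step for v in flock[i]])
        lead = min(flock[:n_prod], key=objective)[:]

        # Scroungers: the worse half relocates, the rest follow the best producer.
        for i in range(n_prod, n):
            if i + 1 > n / 2:
                q = rng.gauss(0.0, 1.0)
                flock[i] = clip([q * math.exp((w - v) / (i + 1) ** 2)
                                 for v, w in zip(flock[i], worst_pos)])
            else:
                flock[i] = clip([p + abs(v - p) * rng.choice((-1.0, 1.0))
                                 for v, p in zip(flock[i], lead)])

        fitness = [objective(p) for p in flock]
        worst = max(range(n), key=fitness.__getitem__)
        worst_pos, worst_fit = flock[worst][:], fitness[worst]

        # Scouts: a random subset reacts to danger.
        for i in rng.sample(range(n), n_scout):
            if fitness[i] > best_fit:
                flock[i] = clip([b + rng.gauss(0.0, 1.0) * abs(v - b)
                                 for v, b in zip(flock[i], best_pos)])
            else:
                k = rng.uniform(-1.0, 1.0)
                scale = abs(fitness[i] - worst_fit) + 1e-12
                flock[i] = clip([v + k * abs(v - w) / scale
                                 for v, w in zip(flock[i], worst_pos)])
            fitness[i] = objective(flock[i])

        leader = min(range(n), key=fitness.__getitem__)
        if fitness[leader] < best_fit:
            best_pos, best_fit = flock[leader][:], fitness[leader]
    return best_pos, best_fit


# Hypothetical stand-in for a cross-validated error surface.
def toy_validation_error(params):
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6.0) ** 2


bounds = [(0.01, 0.3), (2.0, 10.0)]  # assumed ranges for learning_rate, depth
best_params, best_error = sparrow_search(toy_validation_error, bounds)
print(best_params, best_error)
```

In a real setting the objective would train and validate a model for each candidate position, so the number of sparrows and iterations directly sets the training budget. Integer parameters such as tree depth would need rounding before use.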