Business firms and households sometimes seek additional funding to meet certain needs. The resulting demand for funds is met by the credit market, in which banks and other financial lending institutions are the key players (Gaigaliene and Cesnys, 2018). Loans are among the most important products of most financial institutions, and all lenders look for effective business strategies to persuade customers to apply for them. However, some borrowers default on their loan payments (Begum and Deniz, 2019). Default may occur at any point during the loan term when the borrower fails to make the required payments, so assessing a borrower's default risk over time is essential for timely risk management.
Traditionally, credit officers determined whether borrowers could meet their obligations by manually analyzing each borrower's credit history. Over the last decade, this practice has changed with technological advancement (Rehman, 2017). In recent years, financial lending institutions have been using automated loan default models as credit risk scoring tools when granting loans to potential borrowers (Bao et al., 2019). Machine Learning (ML) algorithms have been applied to assess the credit risk of borrowers in financial lending institutions (Djeundj and Crook, 2018). Reliable credit risk models play an important role in loss control and revenue maximization (Luo and Nie, 2016).
Earlier research treated loan default prediction as a binary classification problem, in which a loan is classified as either creditworthy or non-creditworthy (Rosenberg and Gleit, 1994). Linear Discriminant Analysis (LDA) and logistic regression (LR) are two of the most popular tools for constructing credit scoring models (Wiginton, 1980). Subsequently, other classification algorithms, such as artificial neural networks (ANN) (Gulsoy and Kulluk, 2019), support vector machines (SVM) (Alaka et al., 2018), decision trees (DT) (Liu et al., 2015), and Bayesian classifiers (BC) (Carta et al., 2020), have been used to estimate borrowers' probability of default. Recently, time-to-default modeling has attracted increasing research interest (Dirick et al., 2017). Time-to-default data fall into the general category of lifetime data, which is commonly analyzed using survival analysis (SA) (Malekipirbazari and Aksakalli, 2015). In loan prediction, two types of errors inevitably lead to inefficiency in prediction
Credit loans are among the most essential products of most financial institutions. All lenders seek effective business strategies to encourage customers to apply for their loans. However, many borrowers behave adversely once their applications have been approved. To avert this, lenders need techniques for forecasting customer behavior, which has led financial lending institutions to use machine learning algorithms for assessing loan applicants. Despite advances in automating loan decision systems, most existing models do not consider the “early loan repayment” attribute as a factor in reducing prediction error. In practice, accounting for early loan repayment in model building is necessary, since a larger number of timely repayments observed during the loan period reduces the default rate. To compare models on accuracy and prediction error, six supervised machine learning algorithms, namely Random Forest, Artificial Neural Network, Classification and Regression Tree, Support Vector Machine, Logistic Regression, and Naïve Bayes, were used to develop default prediction models that include the early loan repayment attribute. The models were trained and tested on a loan dataset both with and without the early loan repayment attribute and were evaluated using five performance metrics. The results show that models that account for early loan repayment perform better on accuracy, recall, precision, Root Mean Square Error (RMSE), and Receiver Operating Characteristic (ROC) curve values than models trained without the attribute. The Random Forest model proved to be the best predictive model, with 93% accuracy, 11% RMSE, 90% precision, 89% recall, and an 81% ROC value.
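To make the five reported metrics concrete, the following is a minimal sketch, using only the Python standard library, of how accuracy, precision, recall, RMSE, and the area under the ROC curve could be computed from true default labels and predicted default probabilities. The function names, the 0.5 decision threshold, and the toy data are illustrative assumptions, not taken from the study; in particular, the abstract does not say whether RMSE was computed on predicted probabilities or on predicted labels, and probabilities are assumed here.

```python
import math


def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_prob, threshold=0.5):
    """Return the five metrics for binary labels (1 = default) and scores.

    Assumption: RMSE is measured between the 0/1 label and the predicted
    probability; the threshold of 0.5 is likewise an illustrative choice.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_prob))
    return {
        "accuracy": (tp + tn) / n,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "rmse": math.sqrt(squared_error / n),
        "roc_auc": roc_auc(y_true, y_prob),
    }


if __name__ == "__main__":
    # Made-up labels and scores, purely for illustration.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
    for name, value in evaluate(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

Running the same function on the predictions of a model trained with the early loan repayment attribute and on those of a model trained without it gives a like-for-like comparison. Note that RMSE is an error measure, so lower values indicate a better model, whereas higher is better for the other four metrics.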