2015
DOI: 10.1371/journal.pone.0117844
Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

Abstract: Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging …
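The pipeline outlined in the abstract — rebalance an unbalanced dataset into several subsets, fit an L1-regularized (lasso) logistic regression on each, and aggregate the base models — can be sketched as follows. This is a minimal illustration, not the paper's exact method: the synthetic data, the undersampling-based bagging, the number of base models, and the probability-averaging rule are all assumptions for the sketch.

```python
# Minimal sketch of a lasso-logistic regression ensemble on unbalanced data.
# Data and hyperparameters are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic unbalanced data: 1000 majority (label 0) vs. 50 minority (label 1).
X_maj = rng.normal(0.0, 1.0, size=(1000, 5))
X_min = rng.normal(1.5, 1.0, size=(50, 5))
X = np.vstack([X_maj, X_min])
y = np.concatenate([np.zeros(1000), np.ones(50)])

maj_idx = np.where(y == 0)[0]
min_idx = np.where(y == 1)[0]

models = []
for _ in range(10):  # 10 base classifiers (illustrative count)
    # Balance each training subset: all minority rows plus an equal-sized
    # random draw (with replacement) from the majority class.
    sub = np.concatenate([min_idx,
                          rng.choice(maj_idx, size=len(min_idx), replace=True)])
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    clf.fit(X[sub], y[sub])
    models.append(clf)

# Aggregate by averaging predicted probabilities across base models.
proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
pred = (proba >= 0.5).astype(int)
print("minority recall:", pred[y == 1].mean())
```

Training each base learner on a rebalanced subset keeps the logistic loss from being dominated by the majority class, while averaging over subsets recovers some of the discarded majority information.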

Cited by 89 publications (60 citation statements) · References 50 publications
“…These diverse classifiers can be combined using a stacked ensemble approach, which involves using a training data set to empirically fit a function that combines the classifiers to produce a single classification result [12]. Depending on the outcome to be classified, stacked ensemble approaches have used a variety of linear, polynomial and logistic regression models to integrate the individual classifiers into a single prediction [11,13]. This approach provides an opportunity to combine the information from multiple job components, which may vary in their relative strength of association with the occupation classification, to identify the best occupation classification.…”
Section: Introduction
confidence: 99%
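The stacking idea quoted above — base classifiers combined by a regression model fit on their outputs — can be sketched with scikit-learn's `StackingClassifier`. The dataset, base learners, and logistic meta-learner here are illustrative assumptions, not the setup of the cited studies.

```python
# Sketch of a stacked ensemble: a logistic regression meta-learner is fit
# on out-of-fold predictions of the base classifiers. All model and data
# choices are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(),  # meta-learner combines base outputs
    cv=5,  # out-of-fold base predictions avoid leaking training labels
)
stack.fit(X, y)
print("train accuracy:", stack.score(X, y))
```

The `cv` argument matters: fitting the meta-learner on in-sample base predictions would overweight whichever base model overfits most.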
“…This type of three-layer MLP is a commonly adopted ANN structure for binary classification problems such as bankruptcy prediction [11]. Although the characteristics of ANN ensembles, such as efficiency, robustness, and adaptability, make them a valuable tool for classification, decision support, financial analysis, and credit scoring, it should be noted that some researchers have shown that the ensembles of multiple neural network classifiers are not always superior to a single best neural network classifier [17]. Hence, we focus on applying a single neural network model to bankruptcy prediction.…”
Section: Introduction
confidence: 99%
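The three-layer MLP (input, one hidden layer, output) referenced above can be sketched as a single scikit-learn classifier; the layer width, dataset, and iteration budget are illustrative assumptions, not values from the cited bankruptcy studies.

```python
# Sketch of a single three-layer MLP for binary classification.
# Architecture and data are illustrative.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=1)

mlp = MLPClassifier(hidden_layer_sizes=(8,),  # one hidden layer -> three layers total
                    max_iter=2000, random_state=1)
mlp.fit(X, y)
print("train accuracy:", mlp.score(X, y))
```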
“…[3,19–21] Least absolute shrinkage and selection operator (LASSO) [22] is very popular for variable selection and is widely used for analyzing complex data, but only a few works directly deal with class-imbalanced data [23,24]. Another preprocessing strategy for addressing class-imbalanced data is to rebalance the class distribution by sampling [25]. This rebalancing can be done by either undersampling or oversampling to equilibrate the sample sizes of the two classes in the training data, as in the synthetic minority oversampling technique (SMOTE).…”
Section: Introduction
confidence: 99%
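The oversampling strategy mentioned above — SMOTE synthesizes minority samples by interpolating between a minority point and one of its minority-class nearest neighbors — can be sketched in plain NumPy. This is a simplified illustration under assumed data; the `imbalanced-learn` library provides a full implementation.

```python
# Minimal SMOTE-like oversampling sketch: new minority samples are drawn
# on line segments between a minority point and one of its k nearest
# minority neighbors. Simplified for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X_min = rng.normal(1.0, 0.3, size=(20, 3))  # minority class (illustrative)

def smote_like(X, n_new, k=5, rng=rng):
    # Pairwise distances within the minority class.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k] # k nearest minority neighbors
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X))             # random minority seed point
        j = rng.choice(neighbors[i])         # one of its minority neighbors
        lam = rng.random()                   # interpolation weight in [0, 1)
        synth.append(X[i] + lam * (X[j] - X[i]))
    return np.array(synth)

X_new = smote_like(X_min, n_new=30)
print(X_new.shape)  # (30, 3)
```

Because each synthetic point is a convex combination of two minority points, oversampling stays inside the minority region rather than duplicating rows exactly, which is what distinguishes SMOTE from plain random oversampling.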