2016
DOI: 10.1016/j.ejor.2015.09.014

An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market

Abstract: Highlights
• We evaluate default prediction performance of machine learning/regression models.
• Including boosted trees, random forests, penalised linear/semi-parametric logistic regression.
• Using data on over 300,000 residential mortgage loans.
• The results indicate varying degrees of predictive power.
• Statistical tests suggest boosted regression trees outperform penalised logistic regression.

citations
Cited by 77 publications
(30 citation statements)
references
References 52 publications
“…A lot of research effort has been committed to evaluating classification algorithms in credit scoring, ranging from traditional statistical methods, such as logistic regression [1], to non-parametric algorithms, such as neural networks [9]. In the recent years there has been an increased interest for using hybrid and ensemble classifiers in credit risk, such as boosted regression trees, random forests, deep learning methods and other [10], [11], [12], [13] [14]. A number of benchmark studies have been performed, comparing classification accuracy of different classification algorithms [15], [8].…”
Section: Background Work (mentioning)
confidence: 99%
“…Fitzpatrick and Mues () evaluate the performance of several modeling approaches for determining future mortgage default status. Boosted regression trees, random forests, penalized linear and semi‐parametric logistic regression models are applied to four portfolios covering 300,000 Irish owner–occupier mortgages.…”
Section: Previous Studies (mentioning)
confidence: 99%
“…Our study is a shift from the previous (public) machine learning studies on consumer loans as our dataset consist of extensive amounts of real data. Other machine learning-based studies conducted to predict defaults in consumer loans include Khandani et al [4], Butaru et al [5], and Fitzpatrick and Mues [6]. Khandani et al [4] used generalized classification and regression trees to construct nonlinear, nonparametric forecasting models of consumer credit risk by combining customer transactions and credit bureau scores.…”
Section: Introduction and Literature Review (mentioning)
confidence: 99%
“…Butaru et al [5] applied logistic regression, decision trees using the C4.5 algorithm, and the random forests methods to combined consumer trade-line, credit-bureau, and macroeconomic variables to predict delinquency. Fitzpatrick and Mues evaluated the performance of logistic regression, semiparametric generalised additive models, boosted regression trees, and random forests for future mortgage default status [6]. These studies aim to give an overview of the objectives, techniques, and difficulties of credit scoring as an application of forecasting.…”
Section: Introduction and Literature Review (mentioning)
confidence: 99%