2015
DOI: 10.1002/rhc3.12071

Data Sample Selection Issues for Bankruptcy Prediction

Abstract: Bankruptcy prediction is of paramount interest to both academics and practitioners. This paper devotes special care to an important aspect of bankruptcy prediction modeling: the data sample selection issue. To investigate the effect of different data selection methods, three models are adopted: logistic regression, Neural Networks (NNET), and Support Vector Machines (SVM), which have recently gained some popularity in applications. A Monte Carlo simulation study and an empirical analysis on an up…

Cited by 16 publications (10 citation statements)
References 35 publications
“…Deep analyses were taken for exploiting the statistical methods such as Partial Least Square Discriminant Analysis (PLS-DA) [7]. Multicollinearity [8] strategies were also utilized to meet the requirements of econometrics with better data availability.…”
Section: Related Work
Type: mentioning
confidence: 99%
“…Having established this definition of failure, it is necessary to set guidelines for constructing the data samples from which to extract the models. The data collection is central to any prediction model because it informs the preliminary investigation of the data and also facilitates effective model construction (Tian et al, 2015). The review of prior literature reveals two main approaches to data collection.…”
Section: Data Sample Selection
Type: mentioning
confidence: 99%
“…Yet a balanced distribution also can have detrimental effects on data size, due to the scarcity of bankrupt firms, inconsistent bankruptcy rates and a lack of accessibility to these firms' information, such that it is difficult and costly to gather information about failed firms (Tian et al, 2015). For a data set built using a balanced distribution, in which failed firms are paired with firms that did not fail, the capacity to collect failed firms becomes a key condition for the data size.…”
Section: Data Sample Selection
Type: mentioning
confidence: 99%
“…Piri et al [58] use a synthetic informative minority oversampling algorithm to enhance SVM performance with an imbalanced dataset. Tian et al [59] claim that different sampling techniques are required depending on the purpose of the study. Song and Peng [60] suggest a multi-criteria decision making-based approach.…”
Section: Other Studies
Type: mentioning
confidence: 99%
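
The Data Sample Selection statements above describe a balanced design in which each failed firm is paired with a firm that did not fail, so the number of failed firms caps the sample size. The following is a minimal sketch of that idea, not the method of any cited paper: the `(firm_id, failed)` tuple layout, the function name `balanced_sample`, and the purely random one-to-one pairing are all illustrative assumptions (published studies often match on industry, size, or year instead).

```python
import random


def balanced_sample(firms, seed=0):
    """Pair each failed firm with one randomly drawn non-failed firm.

    `firms` is a list of (firm_id, failed) tuples. Both this layout and
    the random pairing rule are assumptions made for illustration.
    """
    rng = random.Random(seed)
    failed = [f for f in firms if f[1]]
    healthy = [f for f in firms if not f[1]]
    # The balanced sample can never exceed twice the number of failed firms.
    n_pairs = min(len(failed), len(healthy))
    matched = rng.sample(healthy, n_pairs)  # draw without replacement
    return failed[:n_pairs] + matched


# Hypothetical population: 5 failed firms among 100.
population = [(i, i < 5) for i in range(100)]
sample = balanced_sample(population)
print(len(sample), sum(1 for _, failed in sample if failed))
```

With only 5 failed firms in a population of 100, the balanced sample keeps 10 observations and discards the other 90, which illustrates why the quoted passage treats the ability to collect failed firms as the key constraint on data size.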