2022
DOI: 10.1007/978-3-031-15471-3_32

Assessment of Creditworthiness Models Privacy-Preserving Training with Synthetic Data

Abstract: Credit scoring models are the primary instrument used by financial institutions to manage credit risk. Research on behavioral scoring is scarce because access to data is difficult: financial institutions must maintain the privacy and security of borrowers' information, which keeps them from collaborating in research initiatives. In this work, we present a methodology that allows us to evaluate the performance of models trained with synthetic data when they are applied to real-world data. Our results show t…
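The evaluation idea in the abstract (train on synthetic data, then score real-world data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the two borrower features, the per-class Gaussian sampler standing in for a real synthesizer such as CTGAN or TVAE, the toy linear scoring model, and all function names are hypothetical.

```python
import random
import statistics

def make_real_data(n, rng):
    """Hypothetical 'real' borrowers: two features, label 1 = default."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        utilization = rng.gauss(0.7 if y else 0.4, 0.15)
        late_payments = rng.gauss(2.0 if y else 1.0, 1.0)
        rows.append(((utilization, late_payments), y))
    return rows

def fit_synthesizer(rows):
    """Crude stand-in for a synthesizer: per-class, per-feature Gaussians."""
    params = {}
    for label in (0, 1):
        feats = [x for x, y in rows if y == label]
        stats = [(statistics.mean(col), statistics.stdev(col)) for col in zip(*feats)]
        params[label] = (stats, len(feats) / len(rows))
    return params

def sample_synthetic(params, n, rng):
    """Draw synthetic rows; no real record is copied."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < params[1][1] else 0
        x = tuple(rng.gauss(mu, sigma) for mu, sigma in params[y][0])
        rows.append((x, y))
    return rows

def train_model(rows):
    """Toy scoring model: weights are the difference of class means."""
    means = {}
    for label in (0, 1):
        feats = [x for x, y in rows if y == label]
        means[label] = [statistics.mean(col) for col in zip(*feats)]
    weights = [m1 - m0 for m0, m1 in zip(means[0], means[1])]
    return lambda x: sum(w * v for w, v in zip(weights, x))

def auc(model, rows):
    """Probability that a random defaulter outscores a random non-defaulter."""
    pos = [model(x) for x, y in rows if y == 1]
    neg = [model(x) for x, y in rows if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
real_train = make_real_data(2000, rng)
real_test = make_real_data(1000, rng)  # real holdout, never shown to the synthesizer
synthetic_train = sample_synthetic(fit_synthesizer(real_train), 2000, rng)

auc_real = auc(train_model(real_train), real_test)        # train real, test real
auc_synth = auc(train_model(synthetic_train), real_test)  # train synthetic, test real
print(f"AUC trained on real data:      {auc_real:.3f}")
print(f"AUC trained on synthetic data: {auc_synth:.3f}")
```

The gap between the two AUC values is the quantity of interest in this kind of comparison: it indicates how much predictive performance is given up by training on synthetic rather than real records.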

Cited by 3 publications (4 citation statements)
References 22 publications (13 reference statements)
“…However, there are some concerns regarding the application of such models. Considering that our results are consistent with previous studies [38,39], the GAN-based synthesizers are expected to require longer computation durations than the TVAE across almost all epochs. Meanwhile, the TVAE generated no variance data for several variables during the shorter epochs (5-30 epochs).…”
Section: Discussion (supporting)
confidence: 92%
“…Synthetic data can be used instead of real data to training models. Muñoz-Cancino and others (Muñoz-Cancino et al, 2022) made an attempt to evaluate creditworthiness assessment models based on realworld data and synthetic data generated by CTGAN and TVAE. The authors concluded that, despite results that were not fully satisfactory, by protecting the data privacy, such an action could boost cooperation between financial institutions and academia.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Autoencoders are unsupervised learning methods that are used in particular to deal with two analytical problems: dimensionality reduction and synthetic data (Muñoz-Cancino et al, 2022). Autoencoders are composed of two parts, an encoder and a decoder.…”
Section: TVAE (mentioning)
confidence: 99%