2022
DOI: 10.3390/info13120575

DEGAIN: Generative-Adversarial-Network-Based Missing Data Imputation

Abstract: Insights and analysis are only as good as the available data. Data cleaning is one of the most important steps in producing quality data for decision making. Machine learning (ML) helps process data quickly and create error-free or limited-error datasets. One of the quality standards for data cleaning is the handling of missing data, also known as data imputation. This research focuses on the use of machine learning methods to deal with missing data. In particular, we propose a generative adversarial net…

Cited by 13 publications (4 citation statements)
References 38 publications
“…In their paper (Wang et al, 2021), the authors used pseudo-label conditional generative adversarial imputation networks (PC-GAIN) to deal with incomplete data problems. Shahbazian and Trubitsyna in their study (Shahbazian & Trubitsyna, 2022) proposed a novel method for handling missing data, DEGAIN, which is an improved version of generative adversarial imputation networks (GAIN) proposed by Yoon, Jordon and Van Der Schaar (Yoon et al, 2018). Compared to GAIN, the deconvolution concept was added to DEGAIN.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The first step we took was data cleaning. This involves identifying and fixing or removing incorrect, corrupt, incorrectly formatted, duplicate, or incomplete data within the dataset [26]. In our dataset, we identified and removed any invalid or missing values, which resulted in a clean dataset with 4700 observations of class 0 and 209 observations of class 1.…”
Section: Data Preprocessing (mentioning)
confidence: 99%
“…To facilitate this differentiation, the hint generator proffers a 'hint matrix', imparting salient cues to the discriminator regarding the provenance of the data. This ensures the maintenance of a dynamic equilibrium, precluding the generator from perpetually overshadowing the discriminator in performance [65][66][67].…”
Section: Data Pre-processing (mentioning)
confidence: 99%
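
The hint mechanism described in the last statement can be illustrated with a small sketch. It is not taken from the DEGAIN paper or from the citing work. It assumes the GAIN-style formulation H = B * M + 0.5 * (1 - B), where M is the 0/1 mask of observed entries and B is a random 0/1 matrix. The function name make_hint_matrix, the hint_rate parameter, and the choice to draw B entrywise from a Bernoulli distribution are illustrative assumptions.

```python
import random


def make_hint_matrix(mask, hint_rate=0.9, seed=None):
    """Build a GAIN-style hint matrix from a 0/1 mask (1 = observed).

    Assumption: each entry of B is drawn independently with
    P(B = 1) = hint_rate. Where B = 1 the discriminator is told the
    true mask value; where B = 0 it receives the uninformative 0.5.
    """
    rng = random.Random(seed)
    hint = []
    for row in mask:
        hint_row = []
        for m in row:
            b = 1 if rng.random() < hint_rate else 0
            # H = B * M + 0.5 * (1 - B)
            hint_row.append(b * m + 0.5 * (1 - b))
        hint.append(hint_row)
    return hint


# Hypothetical 2 x 3 mask: 1 marks an observed cell, 0 a missing one.
mask = [[1, 0, 1], [0, 1, 1]]
hint = make_hint_matrix(mask, hint_rate=0.9, seed=0)
```

A higher hint_rate reveals more of the mask to the discriminator. At hint_rate=1.0 the hint equals the mask, and at hint_rate=0.0 every entry is 0.5, so the discriminator receives no information about which cells were imputed.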