2019
DOI: 10.4018/978-1-5225-7519-1.ch011

Data Imputation Methods for Missing Values in the Context of Clustering

Abstract: Missing data is a common problem that affects clustering quality. Most real-life datasets contain missing values, which in turn affect clustering tasks. This chapter investigates appropriate data treatment methods for varying distributions of missing data, including gamma, Gaussian, and beta distributions. The analyzed data imputation methods include mean, hot-deck, regression, k-nearest neighbor, expectation maximization, and multiple imputation. To reveal the proper methods to deal with missing d…

Cited by 7 publications (1 citation statement)
References 29 publications
“…Frequently, some individuals showed up to five missing values, so listwise case exclusion would have resulted in a sample of n = 794. For this reason, k-nearest neighbor imputation (k = 10) of the missing data was performed, which was used in similar contexts and was able to provide good results for different distributions of missing data (Liao et al., 2014; Cleophas and Zwinderman, 2016; Aktaş et al., 2019). In k-nearest neighbor imputation, the mean values of the neighboring values are formed and assigned to the missing value (von der Hude, 2020).…”
Section: Results (citation type: mentioning)
confidence: 99%
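
The k-nearest neighbor imputation described in the citation statement above (the mean of the neighboring values is assigned to the missing value) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the function name knn_impute, the toy data, and the choice of a distance computed only over columns observed in both rows are all assumptions. The cited study used k = 10; the toy example uses k = 2 because it has only four rows.

```python
import math


def knn_impute(rows, k=10):
    """Fill None entries with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration; `rows` is a list of equal-length
    lists of floats, with None marking a missing value.
    """

    def distance(a, b):
        # Assumption: compare only the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        # Root mean squared difference, so rows with fewer shared columns
        # are not automatically treated as closer.
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]  # copy; the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows where column j is observed
            # and at least one column is shared with the incomplete row.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                # Mean of the neighboring values replaces the missing value.
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy data (invented for this sketch): the second row lacks its third value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(data, k=2)
print(filled[1])
```

With k = 2, the two rows closest to the incomplete row are the first and third, so the missing entry becomes the mean of 3.0 and 2.8; the distant fourth row does not contribute. In practice, features are usually scaled before computing distances so that no single column dominates the neighbor search.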