2009
DOI: 10.1007/978-3-642-03730-6_29

Dynamic Clustering-Based Estimation of Missing Values in Mixed Type Data

Abstract: The appropriate choice of a method for imputation of missing data becomes especially important when the fraction of missing values is large and the data are of mixed type. The proposed dynamic clustering imputation (DCI) algorithm relies on similarity information from shared neighbors, where mixed type variables are considered together. When evaluated on a public social science dataset of 46,043 mixed type instances with up to 33% missing values, DCI resulted in more than 20% improved imputation accu…

Cited by 4 publications
(4 citation statements)
References 15 publications
“…These results seem intuitive since in principle multiple imputation works better when the proportion of MVs is smaller, in which case more data are available for validating the estimates inferred. These results assume particular relevance, if we consider that the appropriate choice of the method for handling MVs is especially important when the fraction of MVs is large [18].…”
Section: Benchmarks Results (mentioning)
confidence: 93%
“…Since many algorithms cannot directly handle MVs, a common practice is to rely on data pre-processing techniques. Usually, this is accomplished by using imputation or simply by removing instances (case deletion) and/or features containing MVs [5,3,11,12,17,18,1]. A review of the methods and techniques to deal with this problem, including a comparison of some well-known approaches, can be found in Laencina et al [5].…”
Section: Methods For Handling MVs In Machine Learning (mentioning)
confidence: 99%