2016
DOI: 10.1371/journal.pone.0161112

Performance Evaluation of Missing-Value Imputation Clustering Based on a Multivariate Gaussian Mixture Model

Abstract: Background: It is challenging to deal with mixture models when missing values occur in clustering datasets. Methods and Results: We propose a dynamic clustering algorithm based on a multivariate Gaussian mixture model that efficiently imputes missing values to generate a “pseudo-complete” dataset. Parameters from different clusters and missing values are estimated by maximum likelihood, implemented with an expectation-maximization algorithm, and multivariate individuals are clustered with Bayesian post…
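The abstract outlines an iterative scheme: fill the missing cells to obtain a pseudo-complete dataset, refit the mixture by maximum likelihood with EM, and assign observations via their Bayesian posterior probabilities. The sketch below illustrates that general idea in Python with scikit-learn's GaussianMixture; the alternating outer loop, the conditional-mean imputation step, and the function name em_impute_cluster are assumptions made for illustration, not the authors' published estimator.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_impute_cluster(X, n_components, n_outer=20, seed=0):
    """Sketch: alternate between (1) refitting a Gaussian mixture on a
    pseudo-complete copy of X and (2) re-imputing the missing cells with
    their conditional expectations under the current fit. Illustrative
    only; not the paper's exact algorithm."""
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)

    # Initial pseudo-complete dataset: column-mean imputation.
    X_fill = X.copy()
    X_fill[miss] = np.take(np.nanmean(X, axis=0), np.where(miss)[1])

    for _ in range(n_outer):
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="full",
                              random_state=seed).fit(X_fill)
        resp = gmm.predict_proba(X_fill)      # Bayesian posterior weights
        # Re-impute: conditional mean of the missing block given the
        # observed block, averaged over components by responsibility.
        for i in np.where(miss.any(axis=1))[0]:
            o, m = ~miss[i], miss[i]
            est = np.zeros(m.sum())
            for k in range(n_components):
                mu, S = gmm.means_[k], gmm.covariances_[k]
                cond = mu[m] + S[np.ix_(m, o)] @ np.linalg.solve(
                    S[np.ix_(o, o)], X[i, o] - mu[o])
                est += resp[i, k] * cond
            X_fill[i, m] = est

    labels = gmm.predict(X_fill)              # MAP cluster assignment
    return X_fill, labels, gmm
```

Rerunning the fit for several values of n_components and comparing the results by AIC mirrors the model-selection step discussed in the citation statements below.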

Cited by 10 publications (7 citation statements)
References: 29 publications

“…These parameters are then applied in computing the probability of each observation. The best number of distributions to fit the data is also determined by minimizing the Akaike information criterion (AIC) (Xiao et al, 2016).…”
Section: Decision-making Algorithm
confidence: 99%
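The step these citing papers describe (fit mixtures with an increasing number of components, keep the one that minimizes the AIC, then read off each observation's membership probabilities) maps directly onto scikit-learn's GaussianMixture, which exposes aic() and predict_proba(). The helper name and the default search range below are illustrative choices, not part of the cited work.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def pick_n_components_by_aic(X, k_max=10, seed=0):
    """Fit mixtures with 1..k_max components and keep the fit that
    minimizes the Akaike information criterion (AIC)."""
    fits = [GaussianMixture(n_components=k, random_state=seed).fit(X)
            for k in range(1, k_max + 1)]
    aics = [m.aic(X) for m in fits]
    best = fits[int(np.argmin(aics))]
    probs = best.predict_proba(X)  # probability of each observation under each cluster
    return best, probs
```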
“…Therefore, methods have been proposed to perform cluster analysis in the presence of missing data (Plaehn, 2019). First, likelihood‐based clustering procedures appear particularly well suited to be adapted to the presence of missing data (Hunt & Jorgensen, 2003; Xiao et al., 2016). Several variations of the widely used K‐means clustering algorithm have also been developed (e.g., Chi, Chi, & Baraniuk, 2016's K‐POD algorithm).…”
Section: Introduction
confidence: 99%
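For the K-means variants mentioned in this statement, the k-POD idea of Chi, Chi, and Baraniuk (2016) can be caricatured as an alternating scheme: run K-means on a pseudo-complete matrix, then refill each missing cell from the centroid of the row's assigned cluster. The snippet below is a rough sketch of that idea under those assumptions, not the reference k-POD implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def kpod_like(X, n_clusters, n_iter=25, seed=0):
    """Rough k-POD-style sketch: alternate K-means on a pseudo-complete
    matrix with centroid-based refilling of the missing cells."""
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    X_fill = X.copy()
    X_fill[miss] = np.take(np.nanmean(X, axis=0), np.where(miss)[1])
    for _ in range(n_iter):
        km = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit(X_fill)
        X_fill[miss] = km.cluster_centers_[km.labels_][miss]
    return km.labels_, X_fill
```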
“…The algorithm is very subjective and requires the number of clusters which are specified in advance. Many clustering algorithms have been developed, such as grid-based [12], hierarchy-based [13], model-based [14], and density-based [15] clustering algorithms. The processing time of the grid clustering algorithm is related to the number of cells divided by each dimensional space, which reduces the quality and accuracy of clustering.…”
Section: Methods
confidence: 99%