2017
DOI: 10.48550/arxiv.1703.02293
Preprint

Variable selection for mixed data clustering: a model-based approach

Abstract: We propose two approaches for selecting variables in latent class analysis (i.e., a mixture model assuming within-component independence), which is the common model-based clustering method for mixed data. The first approach consists of optimizing the BIC with a modified version of the EM algorithm. This approach simultaneously performs both model selection and parameter inference. The second approach consists of maximizing the MICL, which considers the clustering task, with an algorithm of alternate optimization…
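To make the BIC trade-off mentioned in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes every variable is continuous and Gaussian (the paper also covers other variable types), that a relevant variable has one mean and one variance per component, and that an irrelevant variable shares a single mean and variance across components. The function names and the example numbers are hypothetical.

```python
import math


def count_free_parameters(n_components, relevant):
    """Count free parameters of a Gaussian latent class model.

    Assumption for illustration: a relevant variable has a mean and a
    variance in each component; an irrelevant variable has one shared
    mean and one shared variance.
    """
    n_params = n_components - 1  # mixing proportions sum to one
    for is_relevant in relevant:
        n_params += 2 * n_components if is_relevant else 2
    return n_params


def bic(log_likelihood, n_params, n_obs):
    """BIC in the 'larger is better' form: log-likelihood minus penalty."""
    return log_likelihood - 0.5 * n_params * math.log(n_obs)


if __name__ == "__main__":
    # Made-up numbers: 3 components, 100 observations, and the same
    # maximized log-likelihood under both variable roles.
    with_selection = count_free_parameters(3, [True, False, True])
    without_selection = count_free_parameters(3, [True, True, True])
    print(bic(-1000.0, with_selection, 100))
    print(bic(-1000.0, without_selection, 100))
```

Under these assumptions, declaring a variable irrelevant lowers the penalty, so it raises the BIC whenever the log-likelihood does not drop by more than the saved penalty.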

Cited by 3 publications (6 citation statements)
References 35 publications
“…The authors suggest the use of slope heuristics (see Baudry, Maugis and Michel, 2012, for an overview) to calibrate the value of κ. Marbac and Sedki (2017c) extend the approach of Marbac and Sedki (2017a) (see Section 4.3) to clustering and variable selection of data of mixed type, with data on only categorical variables as a particular case. The method is based again on the MICL criterion and does not require multiple calls of the EM algorithm.…”
Section: Model Selection Approaches (mentioning)
confidence: 97%
“…The R packages for variable selection for latent class analysis are ClustMMDD (Toussile, 2016), LCAvarsel and VarSelLCM (Marbac and Sedki, 2017b). In particular, we note that VarSelLCM implements a more general framework for clustering and variable selection of data of mixed type (Marbac and Sedki, 2017c). Table 4 lists the packages, with information regarding the type and the method; all implement a model selection approach.…”
Section: R Packages and Data Example (mentioning)
confidence: 99%
“…In order to deal with large numbers of variables, we propose an extension of the approaches proposed by Marbac and Sedki (2017b) and Marbac and Sedki (2017a), in the framework of variable selection in clustering. The main idea is to use a more constrained model to be able to easily perform model selection.…”
Section: Introduction (mentioning)
confidence: 99%