2014
DOI: 10.1007/s11634-014-0177-3

Enhancing the selection of a model-based clustering with external categorical variables

Abstract: In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which were not directly involved to cluster the data. An approach is proposed in the model-based clustering context to select a model and a number of clusters which both fit the data well and take advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the m…

citations
Cited by 14 publications
(11 citation statements)
references
References 14 publications
“…To deal with the problem of non-substantive classes, one might select the number of classes based on their relationship with external, substantively meaningful, variables (Baudry et al 2014). While this approach does prevent the modeling of non-substantive classes, the nuisance local dependence within the substantive classes remains.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…a gene is either annotated or unannotated), it may seem natural to directly use the SICL defined in Equation (5). However, in contrast to the situation considered by Baudry et al (2014), gene annotation information is often incomplete. More precisely, for each of the G annotation terms, indexed by g, the available information $u^g$ is as follows:…”
Section: Taking Genome Annotations Into Account
Type: mentioning
confidence: 96%
“…More recently, Baudry et al (2014) proposed an ICL-like criterion that takes advantage of the potential explicative ability of external categorical variables $u = (u^1, \dots, u^R)$, where $u^r_{i\ell} = 1$ indicates that the gene $i$ is in category $\ell$ for the $r$-th external categorical variable and 0 otherwise. The idea is to choose a classification z based on y that is coherent with u.…”
Section: Model-based Clustering and Model Selection
Type: mentioning
confidence: 99%
“…Although a variety of finite mixture models has been extensively studied and developed in the literature (Baudry et al 2015; Subedi and McNicholas 2014; Lee and McLachlan 2013; Morris et al 2013), the Gaussian case has received a special attention (Morlini 2012; Nguyen and McLachlan 2015; Hennig 2010; Scrucca and Raftery 2015).…”
Section: Mixture Of Gaussian Models
Type: mentioning
confidence: 99%
“…However, since the number of components for each chunklet is not specified a priori, it is preferable to use an algorithm which detects the number of clusters automatically. Therefore, we combine our method with cross-entropy clustering (CEC) (Tabor and Spurek 2014; Spurek et al 2013; Tabor and Misztal 2013; Śmieja and Tabor 2013, 2015b) which can be seen as a model-based clustering (McLachlan and Krishnan 2008; Morlini 2012; Subedi and McNicholas 2014; Baudry et al 2015) and determines the final number of groups.…”
Section: Introduction
Type: mentioning
confidence: 99%