2018
DOI: 10.51936/lxut1974

Internal evaluation criteria for categorical data in hierarchical clustering

Abstract: The paper compares 11 internal evaluation criteria for hierarchical clustering of categorical data with respect to how well they determine the correct number of clusters. The criteria are divided into three groups according to how they treat cluster quality: the variability-based criteria use the within-cluster variability, the likelihood-based criteria maximize the likelihood function, and the distance-based criteria use distances within and between clusters. The aim is to determine which evaluation criteria perform well an…
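
To make the idea of a variability-based criterion concrete, here is a minimal Python sketch that scores a clustering of categorical data by its size-weighted within-cluster entropy, averaged over variables. The function name `within_cluster_entropy` and the exact weighting are assumptions chosen for illustration; the abstract does not state the formulas of the 11 criteria, so this is not presented as one of them.

```python
from collections import Counter
from math import log


def within_cluster_entropy(rows, labels):
    """Size-weighted within-cluster entropy, averaged over variables.

    rows   : list of equal-length tuples of categorical values
    labels : cluster label for each row
    Lower values mean more homogeneous clusters (hypothetical helper,
    not a formula taken from the paper).
    """
    n = len(rows)
    m = len(rows[0])

    # Group the rows by cluster label.
    clusters = {}
    for row, lab in zip(rows, labels):
        clusters.setdefault(lab, []).append(row)

    total = 0.0
    for members in clusters.values():
        size = len(members)
        for j in range(m):
            # Relative frequencies of the categories of variable j in this cluster.
            counts = Counter(r[j] for r in members)
            h = -sum((c / size) * log(c / size) for c in counts.values())
            total += (size / n) * h
    return total / m


rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
pure = within_cluster_entropy(rows, [0, 0, 1, 1])   # each cluster is homogeneous
mixed = within_cluster_entropy(rows, [0, 1, 0, 1])  # each cluster mixes both categories
```

On this toy data, the homogeneous partition scores 0 and the mixed partition scores ln 2, so a lower value indicates less within-cluster variability.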

Cited by 4 publications (1 citation statement)
References 17 publications
“…However, we included many items that were overlapping or similar (for example, 'performs actions' and 'performs certain actions') to ensure that the potential content space of robot characteristics was sampled in detail (for the final list of 277 characteristics see Supplementary Table 3). The characteristics, as sorted into categories by Sample 2 participants, were subjected to hierarchical cluster analysis for categorical data [114][115][116]: a dissimilarity matrix was computed using Gower's distance [198,199], clusters were produced using Ward's linkage method [200,201] and the optimal number of clusters was determined via the mean silhouette width approach using the partitioning around medoids algorithm [114,202,203]. The five clusters that emerged were then arranged into the robot definition (Table 2).…”
Section: Methods (mentioning)
confidence: 99%
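
The mean silhouette width mentioned in the citation statement is a distance-based criterion, and the following Python sketch illustrates how it can be computed for categorical data. It assumes purely nominal variables, for which Gower's distance reduces to the proportion of mismatching variables (simple matching). The helper names `simple_matching` and `mean_silhouette` are hypothetical, and the cluster labels are taken as given: the sketch does not reproduce Ward's linkage or the partitioning around medoids algorithm used by the citing authors.

```python
def simple_matching(a, b):
    """Proportion of variables on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_silhouette(rows, labels, dist=simple_matching):
    """Mean silhouette width of a partition under a given dissimilarity."""
    n = len(rows)

    # Map each cluster label to the indices of its members.
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean dissimilarity to the other members of the same cluster.
        a = sum(dist(rows[i], rows[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean dissimilarity to the members of any other cluster.
        b = min(
            sum(dist(rows[i], rows[j]) for j in other) / len(other)
            for lab, other in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / n


rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
good = mean_silhouette(rows, [0, 0, 1, 1])  # well-separated partition
bad = mean_silhouette(rows, [0, 1, 0, 1])   # partition that splits identical rows
```

To choose the number of clusters with this criterion, one would compute `mean_silhouette` for the partition obtained at each candidate number of clusters and keep the one with the largest value.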