2013
DOI: 10.1109/tsmcb.2012.2220543

Understanding and Enhancement of Internal Clustering Validation Measures

Abstract: Clustering validation has long been recognized as one of the vital issues essential to the success of clustering applications. In general, clustering validation can be categorized into two classes, external clustering validation and internal clustering validation. In this paper, we focus on internal clustering validation and present a study of 11 widely used internal clustering validation measures for crisp clustering. The results of this study indicate that these existing measures have certain limitations in …

Cited by 225 publications (42 citation statements)
References 34 publications
“…This gave rise to a comprehensive pool of clustering that contained 2400 different clustering results on the same dataset. Each clustering result was then quality evaluated based on the 11 clustering validity indices (Liu et al., 2013) in machine learning. These indices were finally used as weights to combine 2400 clustering results: results with higher validity indexes were given more priority in the final clustering result, and the cluster number was selected from results of the highest validity indices.…”
Section: Methods (mentioning)
confidence: 99%
“…After performing the DBSCAN algorithm, the effectiveness of UC identification was evaluated [49][50][51]. Consistent with the study conducted by Liu, Li, Xiong, Gao, Wu, and Wu [51], CVNN was employed to compare the different versions of UC identification with diverse k-values. Typically, the best version of UC identification has the lowest CVNN value.…”
Section: Data Calculation Layer (mentioning)
confidence: 99%
“…Since the number of possible partitions is high (e.g., for n = 20, p is 524,287), it is recommended to reduce the number by selecting a number of groups as appropriate. The Calinski-Harabasz index, CH [48,49], identifies the best subdivision, i.e., the one that maximises the external heterogeneity between the data groups and minimises the internal one.…”
Section: Submarkets By Cluster (mentioning)
confidence: 99%
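
The Calinski-Harabasz criterion quoted above can be illustrated with a minimal sketch. It assumes the standard definition of the index, the between-group dispersion divided by (k - 1) over the within-group dispersion divided by (n - k), and is not taken from the citing paper. The function names `calinski_harabasz` and `two_group_splits` are hypothetical. The second function rests on a further assumption: the quoted figure of 524,287 for n = 20 matches 2^(n-1) - 1, the number of ways to split n items into two non-empty groups, which may or may not be what the citing authors mean by p.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a crisp partition (higher is better).

    points: list of equal-length numeric sequences
    labels: list of cluster labels, one per point
    Raises ZeroDivisionError if every point coincides with its centroid.
    """
    n = len(points)
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("the index needs 2 <= k < n clusters")

    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]

    between = 0.0  # external heterogeneity: centroid spread, size-weighted
    within = 0.0   # internal heterogeneity: spread around each centroid
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(dim)]
        between += len(members) * sum((c - o) ** 2
                                      for c, o in zip(centroid, overall))
        within += sum(sum((x - c) ** 2 for x, c in zip(p, centroid))
                      for p in members)

    return (between / (k - 1)) / (within / (n - k))


def two_group_splits(n):
    """Ways to split n items into two non-empty groups (assumed meaning of p)."""
    return 2 ** (n - 1) - 1


if __name__ == "__main__":
    X = [[0, 0], [0, 1], [10, 0], [10, 1]]
    print(calinski_harabasz(X, [0, 0, 1, 1]))  # well-separated grouping
    print(calinski_harabasz(X, [0, 1, 0, 1]))  # poorly separated grouping
    print(two_group_splits(20))
```

A subdivision whose groups are compact and far apart scores higher, which is why the index is maximised when choosing among candidate partitions.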