2008
DOI: 10.3233/ida-2008-12602

A comprehensive validity index for clustering

Abstract: Cluster validity indices are used both for estimating the quality of a clustering algorithm and for determining the correct number of clusters in data. Even though several indices exist in the literature, most of them are only relevant for data sets that contain at least two clusters. This paper introduces a new bounded index for cluster validity called the score function (SF), a double exponential expression that is based on a ratio of standard cluster parameters. Several artificial and real-life data sets ar…

Citations: cited by 45 publications (29 citation statements)
References: 44 publications
“…where k represents the number of clusters, the total number of points within the dataset, the error sum-of-squares between inter-clusters, and the squared intra-cluster differences. The relationships for and are defined by the equations [19] = ∑ · ( ,…”
Section: Validation Methods (mentioning)
confidence: 99%
“…To obtain well-separated and compact clusters, SS_B is maximized and SS_W minimized. Therefore, the maximum value for CH indicates a suitable partition for the data set [38].…”
Section: Optimal Clustering Determination (mentioning)
confidence: 99%
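
The statement above can be illustrated with a minimal sketch of the Calinski-Harabasz (CH) index. It assumes the standard textbook definition, CH = (SS_B / (k - 1)) / (SS_W / (N - k)), which is not spelled out in the quoted text. The function name `calinski_harabasz` and its arguments are hypothetical, and this is not the cited authors' implementation.

```python
import numpy as np


def calinski_harabasz(points, labels):
    """Return the CH index of a partition (assumed standard definition).

    points: array-like of shape (N, d); labels: array-like of N cluster ids.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    n = points.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    if k < 2 or k >= n:
        # The index is undefined for a single cluster or for N singleton clusters.
        raise ValueError("CH requires 2 <= k < N")

    overall_mean = points.mean(axis=0)
    ss_b = 0.0  # between-cluster sum of squares (separation)
    ss_w = 0.0  # within-cluster sum of squares (compactness)
    for c in clusters:
        members = points[labels == c]
        centroid = members.mean(axis=0)
        ss_b += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        ss_w += np.sum((members - centroid) ** 2)

    # Note: a partition with ss_w == 0 is not handled specially in this sketch.
    return (ss_b / (k - 1)) / (ss_w / (n - k))


# Four points forming two tight pairs that lie far apart.
data = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = calinski_harabasz(data, [0, 0, 1, 1])  # pairs kept together
bad = calinski_harabasz(data, [0, 1, 0, 1])   # pairs split across clusters
```

In this toy case the partition that keeps each tight pair together has large SS_B and small SS_W, so it receives the higher CH value, which matches the selection rule in the quoted statement. The `k < 2` check also shows why an index of this kind cannot assess a single-cluster partition.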
“…, x_n) where each point is a d-dimensional vector, clustering can be carried out using K-means. This algorithm, already employed in (Saitta et al 2008b) to explain structural identification outcomes, aims to find a set of K clusters…”
Section: K-means (mentioning)
confidence: 99%
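
As a rough illustration of the K-means procedure mentioned above, here is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then recompute centroids). The function name `kmeans`, the parameters `n_iter` and `seed`, and the random initialization are assumptions made for this example; the quoted statement is truncated and does not describe the citing authors' actual implementation.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = points[rng.choice(points.shape[0], size=k, replace=False)]

    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (N, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if members.shape[0] > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment so that labels always match the returned centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids


# Two well-separated groups of three points each.
sample = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
sample_labels, sample_centroids = kmeans(sample, k=2)
```

The number of clusters `k` is an input here, which is the limitation that cluster validity indices are meant to address: the clustering can be repeated for several values of `k` and the partitions compared with an index.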
“…Other data mining techniques such as K-means clustering have already been employed to extract knowledge from a set of candidate models (Saitta et al 2005b). Although K-means requires that the number of clusters is given as input, methods are available to determine reasonable values for this parameter (MacQueen 1967; Pelleg and Moore 2000; Saitta et al 2008b). Previous research involving clustering of candidate models mainly focused on reducing the number of clusters in the CMS by iteratively adding new sensor locations.…”
Section: Introduction (mentioning)
confidence: 99%