Biocomputing 2002 2001
DOI: 10.1142/9789812799623_0002

A stability based method for discovering structure in clustered data

Abstract: We present a method for visually and quantitatively assessing the presence of structure in clustered data. The method exploits measurements of the stability of clustering solutions obtained by perturbing the data set. Stability is characterized by the distribution of pairwise similarities between clusterings obtained from subsamples of the data. High pairwise similarities indicate a stable clustering pattern. The method can be used with any clustering algorithm; it provides a means of rationally defining an o…
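The subsampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the similarity measure, or the subsample size, so everything concrete here is an assumption made for illustration: a tiny k-means, a Jaccard coefficient over co-clustered point pairs, an 80% subsample fraction, and the function names `kmeans`, `jaccard`, and `stability_scores`.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, iters=20):
    """Tiny Lloyd's k-means on tuples (an assumed stand-in for 'any clustering algorithm')."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def jaccard(labels_a, labels_b):
    """Jaccard similarity of co-membership pairs, restricted to shared points.

    Each argument maps a point index to its cluster label. This measure is an
    assumed choice; the abstract only says 'pairwise similarities'.
    """
    shared = sorted(set(labels_a) & set(labels_b))
    both = either = 0
    for i, j in combinations(shared, 2):
        a = labels_a[i] == labels_a[j]
        b = labels_b[i] == labels_b[j]
        both += a and b
        either += a or b
    return both / either if either else 1.0

def stability_scores(points, k, n_pairs=10, fraction=0.8, seed=0):
    """Return similarities between clusterings of random subsample pairs."""
    rng = random.Random(seed)
    n = len(points)
    m = int(fraction * n)
    scores = []
    for _ in range(n_pairs):
        clusterings = []
        for _ in range(2):
            idx = rng.sample(range(n), m)  # perturb the data by subsampling
            labels = kmeans([points[i] for i in idx], k, rng)
            clusterings.append(dict(zip(idx, labels)))
        scores.append(jaccard(*clusterings))
    return scores
```

Calling `stability_scores(points, k)` for several values of `k` gives one list of similarities per `k`. Following the abstract, a distribution concentrated at high values would point to a stable clustering pattern for that `k`.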

Cited by 299 publications (347 citation statements)
References: 13 publications
“…In spite of its straightforwardness, the proposed measure has revealed useful for analyzing the structure of patients clusters, as shown by our experiments. Nevertheless, if the main goal is to estimate the "natural" or "optimal" number of clusters we suggest to use also other more principled global measures based on distribution of some property of the data, such as measures based on distribution of pairwise similarity between clusterings of subsamples of a dataset [30].…”
Section: Discussion (mentioning)
confidence: 99%
“…The Smolkin and Gosh method based on random subspace does not provide a technique to estimate the number of clusters. As suggested by the authors, the model explorer algorithm [30] has been applied to estimate the correct number of cluster. The model explorer algorithm is specifically designed to estimate only the number of cluster (no estimation of the reliability of each individual cluster is provided) and it exploits the overall distribution of the similarity measures to asses the stability of the clustering.…”
Section: Experimental Comparison With Other Stability-based Methods (mentioning)
confidence: 99%