Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2006
DOI: 10.1145/1150402.1150414

Robust information-theoretic clustering

Abstract: How do we find a natural clustering of a real-world point set, which contains an unknown number of clusters with different shapes, and which may be contaminated by noise? Most clustering algorithms were designed under certain assumptions (e.g., Gaussianity); they often require the user to provide input parameters, and they are sensitive to noise. In this paper, we propose a robust framework for determining a natural clustering of a given data set, based on the minimum description length (MDL) principle. The proposed fra…
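
To make the MDL idea in the abstract concrete, below is a minimal sketch (not the paper's actual RIC algorithm): candidate mixture models with different numbers of clusters k are scored by an information-theoretic criterion, and the k with the lowest score is kept. BIC from scikit-learn's GaussianMixture is used here as a stand-in for a description-length score; the paper's coding scheme and its robustness to noise and arbitrary cluster shapes go well beyond this.

```python
# Sketch: choose k by minimizing an MDL-style criterion (BIC as a stand-in).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=500, centers=3, cluster_std=1.0, random_state=0)

scores = {}
for k in range(1, 8):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    scores[k] = gmm.bic(X)          # lower score ~ shorter description length

best_k = min(scores, key=scores.get)
print(f"selected k = {best_k}")     # typically 3 for this synthetic data
```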

Cited by 37 publications (23 citation statements) · References 15 publications · Citing publications: 2008–2021

Citation statements
“…The user specifies the model complexity by parameter settings, most importantly by selecting the number of clusters k. Most approaches to parameter-free clustering, e.g. X-Means [16], G-Means [12] and RIC [5] employ information-theoretic criteria to achieve a balance between the complexity of the model and its quality for interpretation. However, these approaches rely on a relatively simple cluster notion.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some information-theoretic algorithms have recently been proposed with the major focus on avoiding crucial parameter settings in clustering, e.g. [24,15,7,8]. As SONAR, these algorithms rely on the Minimum Description Length principle [13], which allows model selection by regarding clustering as a data compression problem.…”
Section: Related Work and Discussion (mentioning)
confidence: 99%
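
The "clustering as data compression" view mentioned in this statement can be illustrated with a toy two-part score: the description length of a candidate clustering is the cost of encoding the model parameters plus the cost of encoding the data given those parameters. The sketch below is an assumption-laden illustration (per-cluster Gaussians, a flat 32-bit cost per parameter), not the coding scheme of RIC, SONAR, or any of the cited algorithms.

```python
# Toy two-part MDL score: L(model) + L(data | model), measured in bits.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

def description_length(X, labels, bits_per_parameter=32):
    n, d = X.shape
    data_bits = 0.0
    n_params = 0
    for c in np.unique(labels):
        pts = X[labels == c]
        mu = pts.mean(axis=0)
        cov = np.cov(pts, rowvar=False) + 1e-6 * np.eye(d)
        # L(data | model): negative log2-likelihood under this cluster's Gaussian
        data_bits += -multivariate_normal.logpdf(pts, mu, cov).sum() / np.log(2)
        n_params += d + d * (d + 1) // 2        # mean + covariance entries
    model_bits = n_params * bits_per_parameter  # crude L(model)
    return model_bits + data_bits

X = np.random.default_rng(0).normal(size=(300, 2))   # pure noise, no real clusters
for k in (1, 2, 3, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, round(description_length(X, labels)))
```

Splitting structureless data into more clusters buys little in data-coding cost but pays a fixed model cost per cluster, which is how an MDL score can discourage spurious clusters.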
“…The MDL principle was used for vector quantization (Bischof et al 1999), where superfluous vectors were detected via MDL. Böhm et al (2006) used MDL to optimise a given partitioning by choosing specific models for each of the parts. These model-classes need to be pre-defined, requiring premonition of the component models in the data.…”
Section: Related Work (mentioning)
confidence: 99%
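
The per-part model selection described in this statement can be sketched as follows: for a fixed partitioning, each cluster is assigned the distribution from a small pre-defined model class that minimizes its coding cost. The model class (Gaussian, uniform, Laplacian on a single coordinate) and the flat parameter penalty below are illustrative assumptions, not the exact scheme of Böhm et al. (2006).

```python
# Sketch: pick the cheapest-to-encode model per cluster from a fixed class.
import numpy as np
from scipy import stats

PARAM_BITS = 32   # crude cost per real-valued parameter

def coding_cost(data, dist_name):
    if dist_name == "gaussian":
        mu, sigma = data.mean(), data.std(ddof=1) + 1e-12
        nll = -stats.norm.logpdf(data, mu, sigma).sum()
    elif dist_name == "uniform":
        lo, hi = data.min(), data.max()
        nll = -stats.uniform.logpdf(data, loc=lo, scale=hi - lo + 1e-12).sum()
    elif dist_name == "laplacian":
        loc = np.median(data)
        scale = np.abs(data - loc).mean() + 1e-12
        nll = -stats.laplace.logpdf(data, loc, scale).sum()
    # negative log2-likelihood plus a flat cost for the two fitted parameters
    return nll / np.log(2) + 2 * PARAM_BITS

def best_model_per_cluster(X, labels):
    choice = {}
    for c in np.unique(labels):
        col = X[labels == c][:, 0]   # 1-D sketch: first coordinate only
        costs = {m: coding_cost(col, m) for m in ("gaussian", "uniform", "laplacian")}
        choice[c] = min(costs, key=costs.get)
    return choice

rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0, 1, (200, 1)), rng.uniform(5, 9, (200, 1))])
labels = np.array([0] * 200 + [1] * 200)
print(best_model_per_cluster(X, labels))   # likely {0: 'gaussian', 1: 'uniform'}
```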