Proceedings of the 2008 SIAM International Conference on Data Mining 2008
DOI: 10.1137/1.9781611972788.71

Cluster Ensemble Selection

Abstract: This paper studies the ensemble selection problem for unsupervised learning. Given a large library of different clustering solutions, our goal is to select a subset of solutions to form a smaller but better performing cluster ensemble than using all available solutions. We design our ensemble selection methods based on quality and diversity, the two factors that have been shown to influence cluster ensemble performance. Our investigation revealed that using quality or diversity alone may not consistently achie…
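
To make the quality/diversity idea in the abstract concrete, here is a minimal sketch of selecting a subset from a library of clusterings. It is not the paper's method: the abstract shown here is truncated and gives no formulas, so every detail below is an assumption chosen for illustration. In particular, measuring quality as a solution's mean normalized mutual information (NMI) with the rest of the library, measuring diversity as one minus its mean NMI with the already selected solutions, the greedy loop, the weight `alpha`, and the function names `nmi` and `select_ensemble` are all hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(a, b):
    """NMI between two label lists, normalized by sqrt(H(a) * H(b))."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    # Mutual information from the joint and marginal label counts.
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:
        # A one-cluster labeling has zero entropy; treat two such labelings
        # as identical and any other pairing as unrelated.
        return 1.0 if ha == hb else 0.0
    return mi / sqrt(ha * hb)


def select_ensemble(library, k, alpha=0.5):
    """Greedily pick k solution indices, trading off quality and diversity.

    Assumed (hypothetical) definitions, not taken from the paper:
      quality(i)   = mean NMI of solution i with every other library member
      diversity(i) = 1 - mean NMI of solution i with the solutions chosen so far
    Requires at least two solutions in the library.
    """
    m = len(library)
    sim = [[nmi(library[i], library[j]) for j in range(m)] for i in range(m)]
    quality = [sum(sim[i][j] for j in range(m) if j != i) / (m - 1)
               for i in range(m)]
    # Seed with the highest-quality solution.
    chosen = [max(range(m), key=quality.__getitem__)]
    while len(chosen) < min(k, m):
        def score(i):
            diversity = 1 - sum(sim[i][j] for j in chosen) / len(chosen)
            return alpha * quality[i] + (1 - alpha) * diversity
        rest = [i for i in range(m) if i not in chosen]
        chosen.append(max(rest, key=score))
    return chosen


# Toy library: four labelings of the same six points.
library = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, with labels swapped
    [0, 0, 1, 1, 2, 2],
    [0, 1, 0, 1, 0, 1],
]
selected = select_ensemble(library, k=2)
```

Setting `alpha=1.0` ranks purely by the assumed quality measure and `alpha=0.0` purely by the assumed diversity measure, which is one simple way to compare either factor alone against a mix of the two.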

Cited by 55 publications (115 citation statements)
References: 13 publications
“…Future work is to increase the number of clusterers in the ensemble and investigate ensemble selection approaches (Fern and Lin, 2008) in order to avoid the potential degradation of the ensemble performance if a significant number of "bad" clusterers inappropriate for a dataset are present among the ensemble components.…”
Section: Discussion (mentioning)
confidence: 99%
“…One practical advantage is that if two different hard clustering algorithms are applied to the same data set, results are mostly different. It is considered to be very hard to find an optimal way to combine these different clusterings [3,5]. The basic reason for the difficulty of combining different clustering results is the inconsistency between the clusterings, more precisely the fact that while a data element belongs to one cluster according to the first algorithm, it belongs to another cluster according to the second algorithm.…”
Section: Generalized Hard Cluster Analysis (mentioning)
confidence: 99%
“…Oza & Tumer (2008) do the same in a more recent work, in which they present real applications, where using classifier ensembles has been obtaining a greater success in comparison to using individual classifiers, including remote sensoring, medicine and pattern recognition. Fern (2008) analyses how to combine several available solutions to create a more effective cluster ensemble, based on two critical factors in the performance of a cluster ensemble: quality and diversity of solutions. Leisch (1998), one of the pioneers in the branch of cluster ensembles, introduced an algorithm named bagged clustering, which performs several instances of K-means algorithm, in the attempt of obtaining a certain stability in the results and combines partial results through a hierarchical partitioning method.…”
Section: Classification and Cluster Ensemble (mentioning)
confidence: 99%