Proceedings of the 22nd ACM International Conference on Information & Knowledge Management 2013
DOI: 10.1145/2505515.2505527

Efficient hierarchical clustering of large high dimensional datasets

Abstract: Hierarchical clustering is extensively used to organize high dimensional objects such as documents and images into a structure which can then be used in a multitude of ways. However, existing algorithms are limited in their application since the time complexity of agglomerative style algorithms can be as much as O(n² log n), where n is the number of objects. Furthermore, the computation of similarity between such objects is itself time consuming given they are high dimensional and even optimized built in function…

Cited by 20 publications (18 citation statements)
References: 16 publications

“…In the work by Gilpin et al [8], points that fall within the same cone after quantization cannot be distinguished. Similarly, the work by Patra et al [18] altogether avoids clustering close points by putting them in one cluster connected to a 'leader'.…”
Section: Correctness (mentioning)
confidence: 98%
“…An algorithm with a similar aim as ours can be found from [8]. The authors proposed a linear time and space complexity algorithm for hierarchical clustering, based on quantization.…”
Section: Related Work (mentioning)
confidence: 99%
“…The agglomerative algorithm is known by many names, such as Globally Closest Pair (GCP) clustering [4], Sequential Agglomerative Hierarchical Non-overlapping (SAHN) clustering [5], [6], or the term which we will adopt, Agglomerative Hierarchical Clustering (AHC) [1], [3], [7]. As mentioned, the algorithm works by repeatedly merging two clusters together into a larger one.…”
Section: Background: Hierarchical Clustering (mentioning)
confidence: 99%
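
The merge loop described in the statement above can be sketched as follows. This is a deliberately naive illustration, not the quantization-based method of the cited paper: single linkage and Euclidean distance are assumptions chosen for brevity, and the names `single_linkage` and `naive_ahc` are hypothetical. It rescans every pair of clusters on each step, so it is far slower than the optimized agglomerative algorithms the abstract refers to.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Distance between clusters a and b: the closest pair of members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def naive_ahc(points):
    """Merge the two closest clusters until one remains; return the merges."""
    # Each cluster is a tuple of point indices; start with singletons.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the globally closest pair of clusters (assumed single linkage).
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], points),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged cluster replaces its two parts
        merges.append((a, b))
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
merges = naive_ahc(points)
```

The returned list records which two clusters were joined at each step, which is enough to rebuild the hierarchy bottom-up; n points always produce n - 1 merges.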
“…Other works exploited some form of quantization of the space. One example is the work by Gilpin et al [7] who proposed the use of angular quantization for approximate AHC. Several authors have exploited locality-sensitive hashing [2] to speed up the clustering task, for example Koga et al [11] for single linkage and Cochez and Mou [1] for average linkage.…”
Section: Background: Hierarchical Clustering (mentioning)
confidence: 99%