2021
DOI: 10.48550/arxiv.2106.04727
Preprint

ParChain: A Framework for Parallel Hierarchical Agglomerative Clustering using Nearest-Neighbor Chain

Abstract: This paper studies the hierarchical clustering problem, where the goal is to produce a dendrogram that represents clusters at varying scales of a data set. We propose the ParChain framework for designing parallel hierarchical agglomerative clustering (HAC) algorithms, and using the framework we obtain novel parallel algorithms for the complete linkage, average linkage, and Ward's linkage criteria. Compared to most previous parallel HAC algorithms, which require quadratic memory, our new algorithms require only…

Cited by 1 publication (2 citation statements)
References: 54 publications
“…Suppose the distance of the furthest point pair between $C_i$ and $B$ is $\Delta_{\text{comp}}(C_i, B) = d(p, q)$. Since the average Euclidean distance between points in two clusters is not smaller than the distance between their centroids (shown in the full paper [69]), applying this property to $C_i$ and $\{q'\}$ for any point $q' \in B$, we see there must exist some $p' \in C_i$ such that $d(\bar{x}_{C_i}, q') \le d(p', q')$. Since $(p, q)$ is the furthest point pair, we have that $d(\bar{x}_{C_i}, q') \le d(p', q') \le d(p, q)$.…”
Section: Complete Linkage (citation type: mentioning)
confidence: 97%
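
The bound quoted above can be checked numerically. The following is a minimal sketch, not code from the cited paper: the helper names (`centroid`, `complete_linkage`, `max_centroid_to_point`, `random_cluster`) and the random test clusters are assumptions made for illustration. It compares the largest distance from the centroid of $C_i$ to any point of $B$ against the complete-linkage distance between the two clusters.

```python
import math
import random


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))


def complete_linkage(cluster_a, cluster_b):
    """Distance of the furthest cross-cluster point pair."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)


def max_centroid_to_point(cluster_a, cluster_b):
    """Largest distance from the centroid of cluster_a to a point of cluster_b."""
    c = centroid(cluster_a)
    return max(math.dist(c, q) for q in cluster_b)


def random_cluster(rng, n, dim, shift):
    """Hypothetical test data: n uniform points in a unit cube moved by `shift`."""
    return [tuple(rng.uniform(0.0, 1.0) + shift for _ in range(dim)) for _ in range(n)]


rng = random.Random(0)
C_i = random_cluster(rng, 20, 2, 0.0)
B = random_cluster(rng, 15, 2, 3.0)

# The quoted argument implies this inequality for every pair of clusters;
# the small tolerance only absorbs floating-point rounding.
bound_holds = max_centroid_to_point(C_i, B) <= complete_linkage(C_i, B) + 1e-12
print(bound_holds)
```

A numerical check on sampled data illustrates the inequality but does not replace the proof that the statement attributes to the full paper [69].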
“…The nearest neighbor $B$ of $C_i$ must have its centroid inside the ball, i.e., $\|\bar{x}_{C_i} - \bar{x}_B\| \le \Delta_{\text{avg-1}}(C_i, B)$. The proof is provided in the full version of our paper [69].…”
Section: Average Linkage (citation type: mentioning)
confidence: 99%
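
The average-linkage statement can be illustrated the same way. This sketch assumes that $\Delta_{\text{avg-1}}$ denotes the unweighted mean of the Euclidean distances over all cross-cluster point pairs; that reading, the helper names and the random test clusters are assumptions for illustration and are not taken from the cited paper.

```python
import math
import random


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))


def average_linkage(cluster_a, cluster_b):
    """Mean Euclidean distance over all cross-cluster point pairs (assumed avg-1)."""
    total = sum(math.dist(p, q) for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def centroid_distance(cluster_a, cluster_b):
    """Euclidean distance between the two cluster centroids."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


def random_points(rng, n, dim, shift):
    """Hypothetical test data: n uniform points in a unit cube moved by `shift`."""
    return [tuple(rng.uniform(0.0, 1.0) + shift for _ in range(dim)) for _ in range(n)]


rng = random.Random(1)
C_i = random_points(rng, 12, 3, 0.0)
B = random_points(rng, 9, 3, 1.5)

# Centroid distance never exceeds the mean pairwise distance, so the centroid
# of B lies inside the ball of radius average_linkage(C_i, B) around the
# centroid of C_i; the tolerance only absorbs floating-point rounding.
centroid_in_ball = centroid_distance(C_i, B) <= average_linkage(C_i, B) + 1e-12
print(centroid_in_ball)
```

Under this reading, a search for the nearest neighbor of $C_i$ can be restricted to clusters whose centroids fall inside such a ball, which is how the quoted statement uses the bound.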