2020
DOI: 10.1109/access.2020.2999085

MapReduce-Based Distributed Clustering Method Using CF+ Tree

Abstract: Clustering exceptionally large data sets is becoming a major challenge in data analytics as their size continues to increase. Summary-based clustering methods and distributed computing frameworks such as MapReduce can handle this challenge efficiently. Summary-based methods include BIRCH and its extension CF+-ERC. CF+-ERC can reduce the clustering time of large data sets by exploiting the structure of a CF+ tree. However, CF+-ERC is a sequential clustering method, so it cannot be used with multiple machines …
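
To make "summary-based" concrete, here is a minimal sketch of the classic BIRCH clustering feature (CF), the triple (N, LS, SS) that summarizes a group of points. It is not the CF+ tree or the paper's distributed method, which the truncated abstract does not describe. The class name `ClusteringFeature`, its methods, and the toy partitions are hypothetical, chosen only for illustration. Because CFs are additive, summaries built independently on separate partitions (a map-like step) can be combined afterwards (a reduce-like step) without revisiting the raw points.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Hypothetical illustration of a BIRCH-style CF triple (N, LS, SS)."""
    n: int            # number of points summarized
    ls: List[float]   # per-dimension linear sum of the points
    ss: float         # sum of squared norms of the points

    @classmethod
    def from_points(cls, points: Sequence[Sequence[float]]) -> "ClusteringFeature":
        dim = len(points[0])
        ls = [sum(p[d] for p in points) for d in range(dim)]
        ss = sum(x * x for p in points for x in p)
        return cls(len(points), ls, ss)

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # Additivity: the CF of a union is the component-wise sum of the CFs.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            self.ss + other.ss,
        )

    def centroid(self) -> List[float]:
        return [v / self.n for v in self.ls]

    def radius(self) -> float:
        # Root-mean-square distance of the points from the centroid,
        # computed from the summary alone: sqrt(SS/N - ||centroid||^2).
        c2 = sum(c * c for c in self.centroid())
        return max(self.ss / self.n - c2, 0.0) ** 0.5


# Toy data split into two partitions (assumed, for illustration only).
partitions = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(0.0, 2.0), (2.0, 2.0)],
]

# Map-like step: summarize each partition independently.
partial = [ClusteringFeature.from_points(p) for p in partitions]

# Reduce-like step: merge the partial summaries into one.
total = reduce(ClusteringFeature.merge, partial)
```

Merging the two partial summaries gives the same CF as summarizing all four points at once, which is the property that makes this kind of summary convenient to compute in parallel.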

Cited by 4 publications (2 citation statements)
References 25 publications
“…The first one is distributed clustering based on MapReduce [11]-[15]. Its main idea is to adaptively improve classical clustering algorithms using MapReduce, and complete the calculation in parallel.…”
Section: Related Work (mentioning)
confidence: 99%
“…As a comparison object, we choose MapReduce-Based Density Clustering (MBDC) [11][30] as a benchmark. To facilitate the experiments, we improve MBDC algorithm as follows.…”
Section: Fig 8 Randomly Generated Clustering Objects (mentioning)
confidence: 99%