2023
DOI: 10.31449/inf.v47i6.4445

Big Data Clustering Techniques Challenged and Perspectives: Review

Abstract: Clustering is a critical data mining and analysis technique for big data. Adapting clustering algorithms to large volumes of data is difficult, and big data brings new challenges of its own. Because big data can reach petabytes in size and clustering methods have high processing costs, the challenge is how to apply clustering techniques to big data efficiently. This study aims to investigate recent advances in clustering platforms and techniques to handle big d…

Cited by 2 publications (3 citation statements)
References 60 publications
“…Notably, using only the WoS database is considered sufficient for retrieving clustering-related articles. Additionally, many review papers rely on the official website [17][18][19][24] or the publisher's website [20][21][22] as their primary source. In another study, Wang et al [23] retrieved articles indexed by Google Scholar and the WoS database.…”
Section: Methods (mentioning)
confidence: 99%
“…Furthermore, various other taxonomies and surveys exist, covering topics that focus on determining the cluster number [19], machine-learning-based clustering [20], big data clustering [21,22], density peak clustering [23], subspace clustering for high-dimensional data [24], automatic clustering [25], and how nature-inspired metaheuristic techniques are implemented into automatic clustering [26].…”
Section: Introduction (mentioning)
confidence: 99%
“…In this case, CA will manage Task Scheduling by utilizing a cost function that aims to minimize both the make span and the overall time of the tasks. The make span is determined through the following equations [24,27]:…”
Section: Proposed Work (mentioning)
confidence: 99%