2015
DOI: 10.1007/978-3-319-11680-8_23

Fast K-Means Clustering for Very Large Datasets Based on MapReduce Combined with a New Cutting Method

Abstract: Clustering very large datasets is a challenging problem for data mining and processing. MapReduce is considered a powerful programming framework that significantly reduces execution time by dividing a job into several tasks and executing them in a distributed environment. K-Means is one of the most widely used clustering methods, and K-Means based on MapReduce is considered an advanced solution for clustering very large datasets. However, the execution time is still an obstacle due to the increasin…
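To make the MapReduce formulation mentioned in the abstract concrete, here is a minimal single-process sketch of one K-Means iteration split into a map step and a reduce step. It is an illustration under stated assumptions, not the paper's implementation: the function names `map_phase`, `reduce_phase`, and `kmeans_iteration` are hypothetical, points are plain tuples, and no actual distributed framework is involved.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the points per centroid index and return the new means."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx not in sums:
            sums[idx] = list(p)
        else:
            sums[idx] = [a + b for a, b in zip(sums[idx], p)]
        counts[idx] += n
    # Assumption: a centroid that received no points keeps its old position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans_iteration(points, centroids):
    """One K-Means iteration expressed as map followed by reduce."""
    return reduce_phase(map_phase(points, centroids), centroids)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_iteration(data, [(0.0, 0.0), (10.0, 10.0)]))
```

In a real MapReduce job the map step would run in parallel over partitions of the dataset and the reduce step would receive the pairs grouped by centroid index; the sketch keeps both in one process only to show the data flow.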

Cited by 14 publications (5 citation statements)
References: 17 publications
“…In an attempt to reduce difficulty in segmentation task, the k-means based clustering algorithms are used in the image segmentation. K-means based clustering is one of the famous algorithms as it had been implemented by several researchers [28][29][30][31][32]. In this study, various version of k-means clustering algorithms will be implemented.…”
Section: K-means Based Clustering Image Segmentation | Citation type: mentioning
confidence: 99%
“…Since all the features are numerical and the database is very large, the K-means clustering algorithm was used for its efficiency and convergence speed [34]. The silhouette metric [35] was used to evaluate the optimal number of clusters.…”
Section: Clustering Of Driving Styles | Citation type: mentioning
confidence: 99%
“…Van Hieu D. and Meesad P. [15] has proposed an algorithm of KMeans for reducing execution time. They implemented the KMeans by cutting off the last iterations defined in K-Means.…”
Section: K-means | Citation type: mentioning
confidence: 99%
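The last citation statement describes reducing execution time by cutting off the final K-Means iterations. The exact cutting rule is not given on this page, so the sketch below substitutes a simple stand-in: stop as soon as the fraction of points that changed cluster in an iteration falls to or below a threshold. The names `kmeans_with_cutoff` and `cut_ratio`, and the stopping rule itself, are assumptions made for illustration and should not be read as the authors' method.

```python
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its assigned points."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if members:
            new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids


def kmeans_with_cutoff(points, centroids, cut_ratio=0.01, max_iter=100):
    """Run K-Means, stopping early once few points still change cluster.

    cut_ratio is a hypothetical parameter: the largest fraction of points
    allowed to switch cluster in an iteration before the run is cut off.
    """
    labels = assign(points, centroids)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        changed = sum(a != b for a, b in zip(labels, new_labels))
        labels = new_labels
        if changed / len(points) <= cut_ratio:
            break  # remaining iterations would move very few points
    return centroids, labels, iterations


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans_with_cutoff(data, [(0.0, 0.0), (10.0, 10.0)]))
```

The trade-off in any such rule is that a larger threshold saves more late iterations at the cost of a slightly less converged clustering.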