2012 7th International Conference on Electrical and Computer Engineering 2012
DOI: 10.1109/icece.2012.6471633

Improvement of K-means clustering algorithm with better initial centroids based on weighted average

Abstract: Clustering is the process of grouping similar data into a set of clusters. Cluster analysis is one of the major data analysis techniques, and k-means is one of the most popular and widely used partitioning clustering algorithms. However, the original k-means algorithm is computationally expensive, and the resulting set of clusters depends strongly on the selection of initial centroids. Several methods have been proposed to improve the performance of the k-means clustering algorithm. In this paper we propose a heuristic m…

Cited by 44 publications (20 citation statements)
References 5 publications
“…The set of elements between classified clusters is disjointed, and the number of elements in each cluster C i is denoted by n i . The k-means algorithm consists of two steps [29]. First, the initial centroids for each cluster are chosen randomly, then each point in the dataset is assigned to its nearest centroid by Euclidean distance [30].…”
Section: Data Mining Techniques For Ppre Analysis (mentioning)
confidence: 99%
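
The assignment step described in the statement above can be sketched as follows. This is a minimal illustration, not code from the cited paper: the function name `assign_to_nearest` and the sample points and centroids are assumptions made for the example.

```python
import math

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its nearest centroid.

    Distance is plain Euclidean distance, as in the quoted description.
    The function name and signature are hypothetical.
    """
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        # Ties resolve to the lowest centroid index.
        labels.append(distances.index(min(distances)))
    return labels

# Illustrative data only: two well-separated groups and two starting centroids.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]
labels = assign_to_nearest(points, centroids)
```

In a full k-means loop this step would alternate with recomputing each centroid as the mean of its assigned points until the labels stop changing.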
“…The sorted list of data points is then divided into k subsets. The nearest possible value of mean from each dataset becomes the initial centroids of the cluster to be constructed [13]. The pseudocode of the load based initial centroid k-means algorithm is as follows: Input: D = {d1, d2, ..., dn} // set of n data items L // set of load for data points.…”
Section: Load Based Initial Centroid K-means Algorithm (mentioning)
confidence: 99%
“…Also it provides improvement on the classical k-means algorithm to produce more accurate clusters. The three initialization methods explored here are K-means with weighted average method [4], Principal component analysis [5][6] and a heuristic method [7]. The novelty in the presented work comes from the involvement of distributed implementation of initialization methods using MapReduce paradigm on a totally diverse collection of data sets.…”
Section: Related Work (mentioning)
confidence: 99%
“…In [4] Mahmud M. S. et al. employed a uniform method to find a rank score by averaging the attributes of each data point, which generated initial centroids that follow the data distribution of the given set. A sorting algorithm is applied to the computed scores, and the sorted data is divided into "k" subsets, where k is the number of desired clusters.…”
Section: Related Work (mentioning)
confidence: 99%
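
The initialization summarized in the statements above (score each point, sort, split into k subsets, take the point nearest each subset's mean) might look roughly like the sketch below. It is not the authors' implementation. Several details are assumptions: the score is an unweighted mean of the attributes, the subsets are contiguous chunks of near-equal size, and "nearest to the mean" is measured on the score. The name `initial_centroids` is hypothetical.

```python
def initial_centroids(points, k):
    """Pick k starting centroids from sorted, score-ranked data.

    Assumptions (not confirmed by the source): equal attribute weights,
    contiguous near-equal subsets, nearness measured on the score.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    def score(p):
        # Rank score: average of the point's attributes.
        return sum(p) / len(p)

    ranked = sorted(points, key=score)
    n = len(ranked)
    centroids = []
    for i in range(k):
        # Contiguous slice i of k; never empty because k <= n.
        subset = ranked[i * n // k:(i + 1) * n // k]
        mean_score = sum(score(p) for p in subset) / len(subset)
        # The subset member whose score is closest to the subset mean.
        centroids.append(min(subset, key=lambda p: abs(score(p) - mean_score)))
    return centroids

# Illustrative, unordered data with two obvious groups.
points = [(11.0, 11.0), (1.0, 1.0), (3.0, 3.0),
          (12.0, 12.0), (2.0, 2.0), (10.0, 10.0)]
seeds = initial_centroids(points, 2)
```

Because the seeds are drawn from sorted slices of the data rather than at random, repeated runs on the same input give the same starting centroids, which is the property the citing papers highlight.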