2015 IEEE International Congress on Big Data 2015
DOI: 10.1109/bigdatacongress.2015.72

k-Means Performance Improvements with Centroid Calculation Heuristics Both for Serial and Parallel Environments

Abstract: k-means is the most widely used clustering algorithm due to its fairly straightforward implementation across a variety of problems. Meanwhile, as the number of clusters increases, the number of iterations also tends to increase slightly. However, as some studies in the literature indicate, there are still opportunities for improvement. In this study, improved implementations of the k-means algorithm are proposed that use a centroid calculation heuristic, resulting in a performance improvement over traditional k-means. Two d…

Citations: cited by 7 publications (4 citation statements)
References: 22 publications
“…We used IBM Ilog Cplex for the numerical solutions of the optimization problems. Data mining, analysis, and user clustering for the data sets have been performed in our previous publications Karimov et al (2015a); Karimov and Ozbayoglu (2015), hence, for details of data analysis we refer the reader to Karimov et al (2015a); Karimov and Ozbayoglu (2015) and focus on the performance evaluations of the optimization of the menu structure.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Therefore, it is possible that a certain menu item under another menu item can have a higher click count than its parent menu item. Note that mining, clustering, analysis, and classification of ATM usage data has been performed in our earlier studies (Karimov and Ozbayoglu, 2015; Karimov et al., 2015a), hence, in this paper we do not go into the details of data processing and assume that the user logs are mined in an efficient way (i.e., they are clustered and click counts of menu items for each cluster is known). - Original graph representation of the ATM menu to be optimized is readily available (i.e., actual menu structure in use by the financial establishment serving the customers).…”
Section: Problem Definition; citation type: mentioning
confidence: 99%
“…The clustering approach was based on a combination of Fireworks and Cuckoo-search algorithms with representative points being selected as the centroids. In [117], the authors also used a centroid calculation heuristics to help enhance the clustering performance as the number of clusters increases.…”
Section: Big Data Clustering; citation type: mentioning
confidence: 99%