Proceedings of the Twenty-Second Annual Symposium on Computational Geometry 2006
DOI: 10.1145/1137856.1137880
How slow is the k-means method?

Abstract: The k-means method is an old but popular clustering algorithm known for its observed speed and its simplicity. Until recently, however, no meaningful theoretical bounds were known on its running time. In this paper, we demonstrate that the worst-case running time of k-means is superpolynomial by improving the best known lower bound from Ω(n) iterations to 2^{Ω(√n)}.
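The quantity the abstract bounds is the number of iterations the k-means method (Lloyd's method) performs before its cluster assignment stops changing. The following is a minimal Python sketch of that standard iteration with an iteration counter. It does not reproduce the paper's lower-bound construction, and the names `squared_distance` and `lloyd_iterations` are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_iterations(
    points: Sequence[Point],
    centers: Sequence[Point],
    max_iterations: int = 10_000,
) -> Tuple[List[Point], int]:
    """Run Lloyd's method from the given initial centers.

    Returns the final centers and the number of rounds in which the
    assignment of points to centers changed (hypothetical helper, not
    taken from the paper).
    """
    current: List[Point] = [tuple(c) for c in centers]
    assignment: Optional[List[int]] = None

    for iteration in range(1, max_iterations + 1):
        # Assignment step: each point goes to its nearest center
        # (ties are broken in favor of the lowest center index).
        new_assignment = [
            min(range(len(current)), key=lambda j: squared_distance(p, current[j]))
            for p in points
        ]
        if new_assignment == assignment:
            # Nothing moved, so the method has converged.
            return current, iteration - 1
        assignment = new_assignment

        # Update step: move each center to the mean of its assigned points.
        for j in range(len(current)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                current[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return current, max_iterations


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (9.0,), (10.0,)]
    final_centers, rounds = lloyd_iterations(pts, [(0.0,), (1.0,)])
    print(final_centers, rounds)
```

On small inputs like the one above the counter stays tiny, which matches the "observed speed" the abstract mentions. The paper's result concerns specially constructed inputs on which this count grows superpolynomially in n.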

Cited by 483 publications (462 citation statements)
References 12 publications
“…The optimum number of the clusters may vary based on the properties of the dataset, such as the geometric distribution, statistical measures, and neighborhood measures [42,48]. In general, though, it can be reported that the increase in the number of clusters yields higher computational costs with lower condensing ratio and may also cause higher classification accuracy.…”
Section: Batch Datasets (mentioning)
confidence: 99%
“…Very few theoretical guarantees are known about Lloyd's method or its variants. The convergence rate of Lloyd's method has recently been investigated in [10,22,2] and in particular, [2] shows that Lloyd's method can require a superpolynomial number of iterations to converge.…”
Section: Introduction (mentioning)
confidence: 99%
“…For clustering the tweets, we use the K-means algorithm [23], a very popular algorithm for clustering due its speed and simplicity [24,25]. Basically, it has a single parameter to set: k, the number of clusters to find.…”
Section: Topic Identification Based On Clustering (mentioning)
confidence: 99%