2004
DOI: 10.1016/j.comgeo.2004.03.003

A local search approximation algorithm for k-means clustering

Abstract: In k-means clustering we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically efficient approximation algorithms exist, these algorithms are not practical due to the very high constant factors involved. There are many heuristics that are used in pra…
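
The objective described in the abstract can be made concrete with a small sketch. The function and variable names below (`squared_distance`, `kmeans_cost`, `points`, `centers`) are illustrative assumptions, not taken from the paper; the code only evaluates the mean squared distance from each data point to its nearest center for a given set of centers.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Mean squared distance from each point to its nearest center.

    This is the quantity the k-means problem asks us to minimize
    over all choices of k centers (hypothetical helper for illustration).
    """
    if not points or not centers:
        raise ValueError("points and centers must both be non-empty")
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


# Four points in the plane, forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

# Every point lies at distance 1 from its nearest center.
print(kmeans_cost(points, centers))  # 1.0
```

With a single center at (5, 1) the same points give a cost of 26, which shows how strongly the objective depends on where the centers are placed.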

Citations: cited by 374 publications (329 citation statements)
References: 24 publications
“…In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k. The problem is to determine a set of k points m_j, j = 1, 2, 3, …, k, in R^d, called centers, to minimize the mean squared distance from each data point to its nearest center [30]. The objective function is:…”
Section: Algorithm Description (mentioning)
confidence: 99%
“…There are polynomial time algorithms that compute a constant factor approximation to the optimal solution; see for instance the local search algorithm analyzed by Kanungo et al. [13]. If k, the number of centers, is a fixed constant, then the problem admits polynomial-time approximation schemes [8,14].…”
Section: Introduction (mentioning)
confidence: 99%
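
To illustrate what a local search for this problem can look like, here is a minimal sketch of a single-swap heuristic: start from any k centers and repeatedly replace one center with a candidate whenever that lowers the total squared distance. This is a sketch under stated assumptions, not the procedure or the analysis from the cited paper. The candidate set is assumed to be the data points themselves, and all names (`total_cost`, `single_swap_local_search`, `min_gain`) are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def total_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def single_swap_local_search(
    points: Sequence[Point],
    candidates: Sequence[Point],
    k: int,
    min_gain: float = 0.0,
) -> Tuple[List[Point], float]:
    """Swap one center for one candidate while the cost strictly improves.

    Assumption: the first k candidates serve as the starting centers.
    A swap is accepted only if it lowers the cost below
    current * (1 - min_gain), so the loop always terminates.
    """
    if k < 1 or len(candidates) < k:
        raise ValueError("need 1 <= k <= number of candidates")
    centers = list(candidates[:k])
    current = total_cost(points, centers)
    improved = True
    while improved:
        improved = False
        for i, cand in product(range(k), candidates):
            if cand in centers:
                continue  # swapping in an existing center changes nothing useful
            trial = centers[:i] + [cand] + centers[i + 1:]
            trial_cost = total_cost(points, trial)
            if trial_cost < current * (1.0 - min_gain):
                centers, current = trial, trial_cost
                improved = True
                break  # restart the scan from the improved solution
    return centers, current


# Two clusters on a line; the initial centers both sit in the left cluster.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
best_centers, best_cost = single_swap_local_search(points, points, k=2)
print(best_centers, best_cost)
```

The search stops at a local optimum with respect to single swaps, which in general need not be the global optimum.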
“…This formulation is the objective of the popular K-means algorithm (see, for example, [9]), The optimal biclustering of X consists of {R*_1, R*_2} = {{1}, {2, 3, 4}} row clusters and {C*_1, C*_2, C*_3} = {{b, f}, {a, d, e}, {c}} column clusters when using L_1-norm. (c) Biclusters of the data matrix returned by our scheme, that is, using twice an optimal one-way clustering algorithm, once on the 4 row vectors and another on the 6 column vectors, with L_1-norm.…”
Section: Introduction (mentioning)
confidence: 99%