2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06) 2006
DOI: 10.1109/focs.2006.75

The Effectiveness of Lloyd-Type Methods for the k-Means Problem

Abstract: We investigate variants of Lloyd's heuristic for clustering high dimensional data in an attempt to explain its popularity (a half century after its introduction) among practitioners, and in order to suggest improvements in its application. We propose and justify a clusterability criterion for data sets. We present variants of Lloyd's heuristic that quickly lead to provably near-optimal clustering solutions when applied to well-clusterable instances. This is the first performance guarantee for a variant of Lloy…

Cited by 251 publications (296 citation statements)
References 33 publications

“…This considerably decreases the flop count of algorithms that try to minimize the above expression, as there are many fewer terms involved if K ≪ M. There are many algorithms that directly or indirectly try to minimize the above expression over the K columns Y_j. However, it is difficult to find the global minimum and the quality of the local minimum may not be good, though there does not necessarily seem to be agreement over this in the literature, as the precise local minima at which the algorithm stops depends on the starting point [13,14,18].…”
Section: Introduction (mentioning)
confidence: 91%
“…Different techniques have been proposed in the literature for choosing those initial representatives (also called seeds or seeding prototypes) [1,16,30]. In the following, we present some initialization techniques that are considered in the experimental part of this paper as reference systems for performance evaluations.…”
Section: K-means Initialization Techniques (mentioning)
confidence: 99%
“…It is worth stressing that GD is originally conceived to operate in R^n [30]. Therefore, for the purpose of this paper, in the experiment we will adapt the algorithm by idealizing the Voronoi regions by means of the MinSOD cluster representation (4).…”
Section: Pattern Anal Applic (mentioning)
confidence: 99%