The famous k-means++ algorithm of Arthur and Vassilvitskii [SODA 2007] is the most popular way of solving the k-means problem in practice. The algorithm is very simple: it samples the first center uniformly at random, and each of the following k − 1 centers is then always sampled proportional to its squared distance to the closest center so far. Afterward, Lloyd's iterative algorithm is run. The k-means++ algorithm is known to return a Θ(log k)-approximate solution in expectation.

In their seminal work, Arthur and Vassilvitskii [SODA 2007] asked about the guarantees for the following greedy variant: in every step, we sample ℓ candidate centers instead of one and then pick the one that minimizes the new cost. This is also how k-means++ is implemented in, e.g., the popular Scikit-learn library [Pedregosa et al.; JMLR 2011].

We present nearly matching lower and upper bounds for greedy k-means++: we prove that it is an O(ℓ³ log³ k)-approximation algorithm. On the other hand, we prove a lower bound of Ω(ℓ³ log³ k / log²(ℓ log k)). Previously, only an Ω(ℓ log k) lower bound was known [Bhattacharya, Eube, Röglin, Schmidt; ESA 2020], and there was no known upper bound.
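To make the sampling rule concrete, here is a minimal Python sketch of the greedy k-means++ seeding step described above. The function and parameter names are illustrative (not taken from the paper or from scikit-learn); with ell = 1 it reduces to standard k-means++ seeding, and Lloyd's algorithm would then be run on the returned centers.

```python
import numpy as np

def greedy_kmeans_pp_seeding(X, k, ell, seed=None):
    """Illustrative greedy k-means++ seeding.

    At each step, draw `ell` candidate centers with probability
    proportional to the squared distance to the closest chosen
    center, then keep the candidate that minimizes the new cost.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # First center: chosen uniformly at random.
    centers = [X[rng.integers(n)]]
    # d2[i] = squared distance from point i to its closest chosen center.
    d2 = np.sum((X - centers[0]) ** 2, axis=1)

    for _ in range(k - 1):
        # Sample ell candidates proportional to the current squared distances.
        probs = d2 / d2.sum()
        candidate_idx = rng.choice(n, size=ell, p=probs)

        # Keep the candidate whose addition minimizes the resulting cost.
        best_cost, best_d2, best_point = np.inf, None, None
        for idx in candidate_idx:
            new_d2 = np.minimum(d2, np.sum((X - X[idx]) ** 2, axis=1))
            cost = new_d2.sum()
            if cost < best_cost:
                best_cost, best_d2, best_point = cost, new_d2, X[idx]

        centers.append(best_point)
        d2 = best_d2

    return np.array(centers)
```

For example, `greedy_kmeans_pp_seeding(X, k=10, ell=3)` samples three candidates per step and greedily keeps the cheapest one, which mirrors the variant whose approximation guarantee is analyzed in the paper.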