Many clustering algorithms are guided by certain cost functions, such as the widely used k-means cost. These algorithms divide data points into clusters with often complicated boundaries, creating difficulties in explaining the clustering decision. In a recent work, Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML'20) introduced explainable clustering, where the cluster boundaries are axis-parallel hyperplanes and the clustering is obtained by applying a decision tree to the data. The central question here is: how much does the explainability constraint increase the value of the cost function?

Given d-dimensional data points, we show an efficient algorithm that finds an explainable clustering whose k-means cost is at most k^{1−2/d} · poly(d log k) times the minimum cost achievable by a clustering without the explainability constraint, assuming k, d ≥ 2. Combining this with an independent work by Makarychev and Shan (ICML'21), we get an improved bound of k^{1−2/d} · polylog(k), which we show is optimal for every choice of k, d ≥ 2, up to a poly-logarithmic factor in k. For d = 2 in particular, we show an O(log k log log k) bound, improving exponentially over the previous best bound of O(k).
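To make the setting concrete, the following is a minimal illustrative sketch of a threshold tree: starting from reference (unconstrained) k-means centers, it recursively separates the centers with axis-parallel cuts, so that each leaf defines one explainable cluster. The cut rule here (a naive midpoint split along the most spread-out coordinate) is an assumption for illustration only, not the algorithm analyzed in the paper; the function and variable names are likewise hypothetical.

```python
# Illustrative sketch only: axis-parallel threshold-tree clustering on top of
# reference k-means centers. The midpoint cut rule is a naive placeholder,
# not the paper's algorithm.
import numpy as np
from sklearn.cluster import KMeans

def build_threshold_tree(points, centers, center_ids):
    # Leaf: a single reference center remains, so all points here share its label.
    if len(center_ids) == 1:
        return {"label": center_ids[0]}
    # Pick the coordinate where the remaining centers are most spread out and
    # cut at the midpoint between the extreme centers (naive heuristic).
    sub = centers[center_ids]
    dim = int(np.argmax(sub.max(axis=0) - sub.min(axis=0)))
    theta = (sub[:, dim].max() + sub[:, dim].min()) / 2.0
    left_ids = [c for c in center_ids if centers[c, dim] <= theta]
    right_ids = [c for c in center_ids if centers[c, dim] > theta]
    return {
        "dim": dim, "theta": theta,
        "left": build_threshold_tree(points[points[:, dim] <= theta], centers, left_ids),
        "right": build_threshold_tree(points[points[:, dim] > theta], centers, right_ids),
    }

def assign(tree, x):
    # Follow axis-parallel cuts down to a leaf; the leaf label is the cluster.
    while "label" not in tree:
        tree = tree["left"] if x[tree["dim"]] <= tree["theta"] else tree["right"]
    return tree["label"]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))
    km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
    tree = build_threshold_tree(X, km.cluster_centers_, list(range(5)))
    labels = np.array([assign(tree, x) for x in X])
```

Each cluster boundary produced this way is a conjunction of axis-parallel half-spaces, so the assignment of any point can be explained by at most k − 1 simple threshold comparisons; the question studied in the paper is how much k-means cost such a restriction must incur.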