We consider the problem of maximizing a nonnegative submodular set function f : 2 N → R + subject to a p-matchoid constraint in the single-pass streaming setting. Previous work in this context has considered streaming algorithms for modular functions and monotone submodular functions. The main result is for submodular functions that are non-monotone. We describe deterministic and randomized algorithms that obtain a Ω( 1 p )-approximation using O(k log k)-space, where k is an upper bound on the cardinality of the desired set. The model assumes value oracle access to f and membership oracles for the matroids defining the p-matchoid constraint.
We consider the capacitated k-center problem. In this problem we are given a finite set of locations in a metric space and each location has an associated non-negative integer capacity. The goal is to choose (open) k locations (called centers) and assign each location to an open center to minimize the maximum, over all locations, of the distance of the location to its assigned center. The number of locations assigned to a center cannot exceed the center's capacity. The uncapacitated k-center problem has a simple tight 2-approximation from the 80's. In contrast, the first constant factor approximation for the capacitated problem was obtained only recently by Cygan, Hajiaghayi and Khuller who gave an intricate LP-rounding algorithm that achieves an approximation guarantee in the hundreds. In this paper we give a simple algorithm with a clean analysis and prove an approximation guarantee of 9. It uses the standard LP relaxation and comes close to settling the integrality gap (after necessary preprocessing), which is narrowed down to either 7, 8 or 9. The algorithm proceeds by first reducing to special tree instances, and then uses our best-possible algorithm to solve such instances. Our concept of tree instances is versatile and applies to natural variants of the capacitated k-center problem for which we also obtain improved algorithms. Finally, we give evidence to show that more powerful preprocessing could lead to better algorithms, by giving an approximation algorithm that beats the integrality gap for instances where all non-zero capacities are the same.
We study the problem of k-means clustering in the presence of outliers. The goal is to cluster a set of data points to minimize the variance of the points assigned to the same cluster, with the freedom of ignoring a small set of data points that can be labeled as outliers. Clustering with outliers has received a lot of attention in the data processing community, but practical, efficient, and provably good algorithms remain unknown for the most popular k-means objective. Our work proposes a simple local search-based algorithm for k-means clustering with outliers. We prove that this algorithm achieves constant-factor approximate solutions and can be combined with known sketching techniques to scale to large data sets. Using empirical evaluation on both synthetic and large-scale real-world data, we demonstrate that the algorithm dominates recently proposed heuristic approaches for the problem.
We consider the capacitated k-center problem. In this problem we are given a finite set of locations in a metric space and each location has an associated nonnegative integer capacity. The goal is to choose (open) k locations (called centers) and assign each location to an open center to minimize the maximum, over all locations, of the distance of the location to its assigned center. The number of locationsThe main result of this paper was obtained independently by An, Bhaskara and Svensson, and by Chekuri, Gupta and Madan. This paper is based on the manuscript of the first group.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.