Abstract. We consider adding k shortcut edges (i.e. edges of small fixed length δ ≥ 0) to a graph so as to minimize the weighted average shortest path distance over all pairs of vertices. We explore several variations of the problem and give O(1)-approximations for each. We also improve the best known approximation ratio for metric k-median with penalties, as many of our approximations depend upon this bound. We give a (1 + 2, β)-approximation with runtime exponential in p. If we set β = 1 (to be exact on the number of medians), this matches the best current k-median (without penalties) result.
One of the central problems in data-analysis is k-means clustering. In recent years, considerable attention in the literature addressed the streaming variant of this problem, culminating in a series of results (Har-Peled and Mazumdar; Frahling and Sohler; Frahling, Monemizadeh, and Sohler; Chen) that produced a (1 + ε)-approximation for k-means clustering in the streaming setting. Unfortunately, since optimizing the k-means objective is Max-SNP hard, all algorithms that achieve a (1 + ε)-approximation must take time exponential in k unless P=NP.Thus, to avoid exponential dependence on k, some additional assumptions must be made to guarantee high quality approximation and polynomial running time. A recent paper of Ostrovsky, Rabani, Schulman, and Swamy (FOCS 2006) introduced the very natural assumption of data separability: the assumption closely reflects how k-means is used in practice and allowed the authors to create a high-quality approximation for k-means clustering in the non-streaming setting with polynomial running time even for large values of k.Their work left open a natural and important question: are similar results possible in a streaming setting? This is the question we answer in this paper, albeit using substantially different techniques.We show a near-optimal streaming approximation algorithm for k-means in high-dimensional Euclidean space with sublinear memory and a single pass, under the same data separability assumption. Our algorithm offers significant improvements in both space and run- ning time over previous work while yielding asymptotically best-possible performance (assuming that the running time must be fully polynomial and P = NP).The novel techniques we develop along the way imply a number of additional results: we provide a high-probability performance guarantee for online facility location (in contrast, Meyerson's FOCS 2001 algorithm gave bounds only in expectation); we develop a constant approximation method for the general class of semi-metric clustering problems; we improve (even without σ-separability) by a logarithmic factor space requirements for streaming constant-approximation for k-median; finally we design a "re-sampling method" in a streaming setting to convert any constant approximation for clustering to a [1 + O(σ 2 )]-approximation for σ-separable data.
In modern data centers and cloud computing systems, jobs often require resources distributed across nodes providing a wide variety of services. Motivated by this, we study the Coupled Placement problem, in which we place jobs into computation and storage nodes with capacity constraints, so as to optimize some costs or profits associated with the placement. The coupled placement problem is a natural generalization of the widely-studied generalized assignment problem (GAP), which concerns the placement of jobs into single nodes providing one kind of service. We also study a further generalization, the k-Sided Placement problem, in which we place jobs into k-tuples of nodes, each node in a tuple offering one of k services. For both the coupled and k-sided placement problems, we consider minimization and maximization versions. In the minimization versions (MinCP and MinkSP), the goal is to achieve minimum placement cost, while incurring a minimum blowup in the capacity of the individual nodes. Our first main result is an algorithm for MinkSP that achieves optimal cost while increasing capacities by at most a factor of k + 1, also yielding the first constant-factor approximation for MinCP. In the maximization versions (MaxCP and MaxkSP), the goal is to maximize the total weight of the jobs that are placed under hard capacity constraints. MaxkSP can be expressed as a k-column sparse integer program, and can be approximated to within a factor of O(k) 2 Madhukar Korupolu et al. factor using randomized rounding of a linear program relaxation. We consider alternative combinatorial algorithms that are much more efficient in practice. Our second main result is a local search based combinatorial algorithm that yields a 15-approximation and O(k 2)-approximation for MaxCP and MaxkSP respectively. Finally, we consider an online version of MaxkSP and present algorithms that achieve logarithmic competitive ratio under certain necessary technical assumptions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.