We consider the problem of maintaining frequency counts for items occurring frequently in the union of multiple distributed data streams. Naive methods of combining approximate frequency counts from multiple nodes tend to result in excessively large data structures that are costly to transfer among nodes. To minimize communication requirements, the degree of precision maintained by each node while counting item frequencies must be managed carefully. We introduce the concept of a precision gradient for managing precision when nodes are arranged in a hierarchical communication structure. We then study the optimization problem of how to set the precision gradient so as to minimize communication, and provide optimal solutions that minimize worst-case communication load over all possible inputs. We then introduce a variant designed to perform well in practice, with input data that does not conform to worst-case characteristics. We verify the effectiveness of our approach empirically using real-world data, and show that our methods incur substantially less communication than naive approaches while providing the same error guarantees on answers. In addition, we extend techniques for maintaining frequency counts of high-frequency items in one or more streams by making them time-sensitive. Time-sensitivity is achieved by associating weights with items that decay exponentially with time. We analyze the error bounds and worst-case space bounds for the extended algorithms.
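The exponentially decaying weights mentioned in this abstract can be illustrated with a minimal sketch. It keeps exact decayed weights for every item, so it is not the space-bounded approximate algorithm the abstract analyzes; the class name `DecayedCounter`, the `decay_rate` parameter, and the `threshold` fraction are hypothetical names chosen for illustration.

```python
import math


class DecayedCounter:
    """Exact exponentially decayed counts (illustrative only, not the paper's algorithm)."""

    def __init__(self, decay_rate):
        # decay_rate is an assumed parameter: a weight w becomes
        # w * exp(-decay_rate * dt) after dt time units.
        self.decay_rate = decay_rate
        self.weights = {}      # item -> decayed weight as of self.last_time
        self.last_time = 0.0

    def _advance(self, now):
        """Decay every stored weight from self.last_time up to `now`."""
        if now < self.last_time:
            raise ValueError("timestamps must be non-decreasing")
        factor = math.exp(-self.decay_rate * (now - self.last_time))
        for item in self.weights:
            self.weights[item] *= factor
        self.last_time = now

    def add(self, item, now):
        """Record one occurrence of `item` at time `now` with initial weight 1."""
        self._advance(now)
        self.weights[item] = self.weights.get(item, 0.0) + 1.0

    def frequent(self, now, threshold):
        """Return items whose decayed weight is at least `threshold` of the total."""
        self._advance(now)
        total = sum(self.weights.values())
        return {i for i, w in self.weights.items() if w >= threshold * total}


counter = DecayedCounter(decay_rate=math.log(2))  # weights halve every time unit
counter.add("a", 0.0)
counter.add("b", 1.0)
print(counter.frequent(1.0, 0.5))
```

With a halving period of one time unit, the older item "a" carries half the weight of the newer item "b" at time 1, so only "b" clears a 50% threshold.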
Robust optimization has traditionally focused on uncertainty in data and costs in optimization problems to formulate models whose solutions will be optimal in the worst case among the various uncertain scenarios in the model. While these approaches may be thought of as defining data- or cost-robust problems, we formulate a new "demand-robust" model motivated by recent work on two-stage stochastic optimization problems. We propose this in the framework of general covering problems and prove a general structural lemma about special types of first-stage solutions for such problems: there exists a first-stage solution that is a minimal feasible solution for the union of the demands for some subset of the scenarios and whose objective function value is no more than twice the optimal. We then provide approximation algorithms for a variety of standard discrete covering problems in this setting, including minimum cut, minimum multi-cut, shortest paths, Steiner trees, vertex cover and uncapacitated facility location. While many of our results draw from rounding approaches recently developed for stochastic programming problems, we also show new applications of old metric rounding techniques for cut problems in this demand-robust setting.
We consider the problem of embedding finite metrics with slack: we seek to produce embeddings with small dimension and distortion while allowing a (small)
We consider the problem of minimizing the total weighted flow time on a single machine with preemptions. We give an online algorithm that is O(k)-competitive for k weight classes. This implies an O(log W)-competitive algorithm, where W is the maximum to minimum ratio of weights. This algorithm also implies an O(log n + log P)-approximation ratio for the problem, where P is the ratio of the maximum to minimum job size and n is the number of jobs. We also consider the nonclairvoyant setting where the size of a job is unknown upon its arrival and becomes known to the scheduler only when the job meets its service requirement. We consider the resource augmentation model, and give a (1 + ε)-speed, (1 + 1/ε)-competitive online algorithm.
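One common way to pass from a bound in terms of k weight classes to a bound in terms of W is to group weights by powers of two. The abstract does not say how the classes are formed, so the sketch below is an assumption rather than the authors' construction; `weight_class` and `count_classes` are hypothetical helper names.

```python
import math


def weight_class(weight, min_weight):
    """Index k such that 2**k <= weight / min_weight < 2**(k + 1)."""
    ratio = weight / min_weight
    k = 0
    # Integer loop instead of floor(log2(ratio)) to avoid rounding surprises.
    while 2 ** (k + 1) <= ratio:
        k += 1
    return k


def count_classes(weights):
    """Number of distinct power-of-two classes among the given positive weights."""
    min_weight = min(weights)
    return len({weight_class(w, min_weight) for w in weights})


weights = [1, 2, 3, 4, 7, 8, 100]
W = max(weights) / min(weights)
# At most floor(log2 W) + 1 classes exist, so k = O(log W).
print(count_classes(weights), math.floor(math.log2(W)) + 1)
```

Weights in the same class differ by less than a factor of two, so treating them as equal changes the weighted objective by at most a constant factor under this grouping.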
We consider the undirected minimum spanning tree problem in a stochastic optimization setting. For the two-stage stochastic optimization formulation with finite scenarios, a simple iterative randomized rounding method on a natural LP formulation of the problem yields a nearly best-possible approximation algorithm. We then consider the stochastic minimum spanning tree problem in a more general black-box model and show that even under the assumption of bounded inflation the problem remains log n-hard to approximate unless P = NP, where n is the size of the graph. We also give an approximation algorithm matching the lower bound up to a constant factor. Finally, we consider a slightly different cost model where the second-stage costs are independent random variables uniformly distributed on [0, 1]. We show that a simple thresholding heuristic has cost bounded by the optimal cost plus ζ(3)/4 + o(1).