Alan Roytman scite author profile

One of the central problems in data analysis is k-means clustering. In recent years, considerable attention in the literature has addressed the streaming variant of this problem, culminating in a series of results (Har-Peled and Mazumdar; Frahling and Sohler; Frahling, Monemizadeh, and Sohler; Chen) that produced a (1 + ε)-approximation for k-means clustering in the streaming setting. Unfortunately, since optimizing the k-means objective is Max-SNP hard, all algorithms that achieve a (1 + ε)-approximation must take time exponential in k unless P = NP. Thus, to avoid exponential dependence on k, some additional assumptions must be made to guarantee high-quality approximation and polynomial running time. A recent paper of Ostrovsky, Rabani, Schulman, and Swamy (FOCS 2006) introduced the very natural assumption of data separability: the assumption closely reflects how k-means is used in practice and allowed the authors to create a high-quality approximation for k-means clustering in the non-streaming setting with polynomial running time even for large values of k. Their work left open a natural and important question: are similar results possible in a streaming setting? This is the question we answer in this paper, albeit using substantially different techniques. We show a near-optimal streaming approximation algorithm for k-means in high-dimensional Euclidean space with sublinear memory and a single pass, under the same data separability assumption. Our algorithm offers significant improvements in both space and running time over previous work while yielding asymptotically best-possible performance (assuming that the running time must be fully polynomial and P ≠ NP). The novel techniques we develop along the way imply a number of additional results: we provide a high-probability performance guarantee for online facility location (in contrast, Meyerson's FOCS 2001 algorithm gave bounds only in expectation); we develop a constant approximation method for the general class of semi-metric clustering problems; we improve (even without σ-separability) by a logarithmic factor the space requirements for streaming constant-approximation for k-median; finally, we design a "re-sampling method" in a streaming setting to convert any constant approximation for clustering to a [1 + O(σ²)]-approximation for σ-separable data.


Online Multidimensional Load Balancing

Meyerson, Roytman, Tagiku (2013)


Liquid Price of Anarchy

Azar, Feldman, Gravin, et al. (2017)


Incorporating budget constraints into the analysis of auctions has become increasingly important, as they model practical settings more accurately. The social welfare function, which is the standard measure of efficiency in auctions, is inadequate for settings with budgets, since there may be a large disconnect between the value a bidder derives from obtaining an item and what can be liquidated from her. The Liquid Welfare objective function has been suggested as a natural alternative for settings with budgets. Simple auctions, like simultaneous item auctions, are evaluated by their performance at equilibrium using the Price of Anarchy (PoA) measure: the ratio of the objective function value of the optimal outcome to that of the worst equilibrium. Accordingly, we evaluate the performance of simultaneous item auctions in budgeted settings by the Liquid Price of Anarchy (LPoA) measure: the ratio of the optimal Liquid Welfare to the Liquid Welfare obtained in the worst equilibrium. Our main result is that the LPoA for mixed Nash equilibria is bounded by a constant when bidders are additive and items can be divided into sufficiently many discrete parts. Our proofs are robust, and can be extended to achieve similar bounds for simultaneous second price auctions as well as Bayesian Nash equilibria. For pure Nash equilibria, we establish tight bounds on the LPoA for the larger class of fractionally-subadditive valuations. To derive our results, we develop a new technique in which some bidders deviate (surprisingly) toward a non-optimal solution. In particular, this technique does not fit into the smoothness framework.
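As an illustration of the objective discussed in this abstract, the following is a minimal sketch, not taken from the paper. It assumes the usual definition of Liquid Welfare (each bidder contributes the smaller of her value for her bundle and her budget), additive bidders, and indivisible items, whereas the paper's main result concerns items split into many parts. The bidder names, item names, and numbers are hypothetical.

```python
from itertools import product

def social_welfare(values, allocation):
    """Sum of each bidder's additive value for the items she receives."""
    return sum(values[b][item] for b, items in allocation.items() for item in items)

def liquid_welfare(values, budgets, allocation):
    """Assumed definition: each bidder contributes min(value of bundle, budget)."""
    total = 0
    for bidder, items in allocation.items():
        bundle_value = sum(values[bidder][item] for item in items)
        total += min(bundle_value, budgets[bidder])
    return total

def optimal_liquid_welfare(values, budgets, items):
    """Brute force over all assignments of whole items to bidders (tiny inputs only)."""
    bidders = list(values)
    best = 0
    for owners in product(bidders, repeat=len(items)):
        allocation = {b: [] for b in bidders}
        for item, owner in zip(items, owners):
            allocation[owner].append(item)
        best = max(best, liquid_welfare(values, budgets, allocation))
    return best

# Hypothetical instance: bidder "a" values the items highly but has a small budget.
values = {"a": {"x": 10, "y": 4}, "b": {"x": 6, "y": 5}}
budgets = {"a": 3, "b": 8}
all_to_a = {"a": ["x", "y"], "b": []}
```

On this instance, giving both items to bidder "a" maximizes social welfare, but her budget caps her contribution to Liquid Welfare, so the two objectives favor different allocations. The LPoA would then be the ratio of `optimal_liquid_welfare(...)` to the Liquid Welfare of the worst equilibrium, which this sketch does not compute.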


Makespan Minimization via Posted Prices

Feldman, Fiat, Roytman (2017)


We consider job scheduling settings, with multiple machines, where jobs arrive online and choose a machine selfishly so as to minimize their cost. Our objective is the classic makespan minimization objective, which corresponds to the completion time of the last job to complete. The incentives of the selfish jobs may lead to poor performance. To reconcile the differing objectives, we introduce posted machine prices. The selfish job seeks to minimize the sum of its completion time on the machine and the posted price for the machine. Prices may be static (i.e., set once and for all before any arrival) or dynamic (i.e., change over time), but they are determined only by the past, assuming nothing about upcoming events. Obviously, such schemes are inherently truthful. We consider the competitive ratio: the ratio between the makespan achievable by the pricing scheme and that of the optimal algorithm. We give tight bounds on the competitive ratio for both dynamic and static pricing schemes for identical, restricted, related, and unrelated machine settings. Our main result is a dynamic pricing scheme for related machines that gives a constant competitive ratio, essentially matching the competitive ratio of online algorithms for this setting. In contrast, dynamic pricing gives poor performance for unrelated machines. This lower bound also exhibits a gap between what can be achieved by pricing versus what can be achieved by online algorithms.
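The job's decision rule described in this abstract can be sketched as a small simulation. This is only an illustration under stated assumptions, not the pricing scheme from the paper: machines are related (each has a speed), jobs on a machine run in arrival order so a job's completion time is the machine's current load plus its own processing time, prices are static, and ties go to the lowest machine index. The function names and the numbers are hypothetical.

```python
def choose_machine(loads, speeds, prices, size):
    """Index of the machine minimizing completion time plus posted price."""
    costs = [
        loads[m] + size / speeds[m] + prices[m]  # completion time + price
        for m in range(len(loads))
    ]
    # min() returns the first minimizer, so ties go to the lowest index.
    return min(range(len(costs)), key=costs.__getitem__)

def run_static_prices(job_sizes, speeds, prices):
    """Let jobs arrive one by one and pick machines selfishly; return the makespan."""
    loads = [0.0] * len(speeds)
    for size in job_sizes:
        m = choose_machine(loads, speeds, prices, size)
        loads[m] += size / speeds[m]
    return max(loads)

# Hypothetical instance: one fast machine (speed 2) and one slow machine (speed 1).
speeds = [2.0, 1.0]
no_prices = [0.0, 0.0]
pricey_fast = [5.0, 0.0]  # a high price on the fast machine steers the job away
```

With no prices, a single job of size 4 picks the fast machine; with a price of 5 on the fast machine, the same job prefers the slow one. Dividing the resulting makespan by the optimal makespan for the same job sequence would give the ratio the abstract calls the competitive ratio.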



Alan Roytman

Packing Small Vectors

Streaming k-means on Well-Clusterable Data

Online Multidimensional Load Balancing

Liquid Price of Anarchy

Makespan Minimization via Posted Prices
