Songze Li scite author profile

Abstract: How can we optimally trade extra computing power to reduce the communication load in distributed computing? We answer this question by characterizing a fundamental tradeoff between computation and communication in distributed computing, i.e., the two are inversely proportional to each other. More specifically, a general distributed computing framework, motivated by commonly used structures like MapReduce, is considered, where the overall computation is decomposed into computing a set of "Map" and "Reduce" functions distributedly across multiple computing nodes. A coded scheme, named "Coded Distributed Computing" (CDC), is proposed to demonstrate that increasing the computation load of the Map functions by a factor of r (i.e., evaluating each function at r carefully chosen nodes) can create novel coding opportunities that reduce the communication load by the same factor. An information-theoretic lower bound on the communication load is also provided, which matches the communication load achieved by the CDC scheme. As a result, the optimal computation-communication tradeoff in distributed computing is exactly characterized. Finally, the coding techniques of CDC are applied to the Hadoop TeraSort benchmark to develop a novel CodedTeraSort algorithm, which is empirically demonstrated to speed up the overall job execution by 1.97× to 3.39× for typical settings of interest.
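The inverse relationship described in this abstract can be illustrated with a small calculation. The sketch below assumes the normalized load expressions usually quoted for this setting, 1 - r/K for uncoded shuffling and (1/r)(1 - r/K) for CDC, where K is the number of nodes. The abstract itself states only the factor-r reduction, so the exact formulas and the function names here are assumptions for illustration.

```python
from fractions import Fraction


def uncoded_load(r: int, K: int) -> Fraction:
    """Assumed normalized communication load without coding.

    Each Map function is evaluated at r of the K nodes, so a node
    already holds a fraction r/K of the values it needs.
    """
    return 1 - Fraction(r, K)


def coded_load(r: int, K: int) -> Fraction:
    """Assumed normalized communication load with CDC.

    Coding reduces the uncoded load by a further factor of r.
    """
    return Fraction(1, r) * (1 - Fraction(r, K))


if __name__ == "__main__":
    K = 10  # hypothetical cluster size
    for r in range(1, K + 1):
        print(r, uncoded_load(r, K), coded_load(r, K))
```

For any r between 1 and K - 1, the ratio of the two loads is exactly r, which matches the "same factor" statement in the abstract.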

A Unified Coding Framework for Distributed Computing with Straggling Servers

Maddah-Ali

Avestimehr

2016

179

213

Abstract: We propose a unified coded framework for distributed computing with straggling servers, by introducing a tradeoff between "latency of computation" and "load of communication" for some linear computation tasks. We show that the coded scheme of [1]-[3], which repeats the intermediate computations to create coded multicasting opportunities that reduce the communication load, and the coded scheme of [4], [5], which generates redundant intermediate computations to combat straggling servers, can be viewed as special instances of the proposed framework, by considering two extremes of this tradeoff: minimizing either the load of communication or the latency of computation individually. Furthermore, the latency-load tradeoff achieved by the proposed coded framework allows one to systematically operate at any point on that tradeoff to perform distributed computing tasks. We also prove an information-theoretic lower bound on the latency-load tradeoff, which is shown to be within a constant multiplicative gap from the achieved tradeoff at the two end points.

Coded MapReduce

2015

MapReduce is a commonly used framework for executing data-intensive tasks on distributed server clusters. We present "Coded MapReduce", a new framework that enables and exploits a particular form of coding to significantly reduce the inter-server communication load of MapReduce. In particular, Coded MapReduce exploits the repetitive mapping of data blocks at different servers to create coded multicasting opportunities in the shuffling phase, cutting down the total communication load by a multiplicative factor that grows linearly with the number of servers in the cluster. We also analyze the tradeoff between the "computation load" and the "communication load" of Coded MapReduce.
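A toy instance may help make the coded multicasting idea concrete. The sketch below is not taken from the paper: the file placement, the one-byte `value` function, and the pairing of values are hypothetical choices for 3 servers and 6 files, with each file mapped at 2 servers. Each server broadcasts the XOR of two intermediate values, and each of the other two servers cancels the value it already computed locally to recover the one it is missing.

```python
def value(q: int, n: int) -> int:
    """Hypothetical one-byte intermediate value of Reduce function q on file n."""
    return (17 * q + 31 * n) % 256


# Assumed placement: every file is mapped at exactly two of the three servers.
files_at = {1: {1, 2, 3, 4}, 2: {3, 4, 5, 6}, 3: {5, 6, 1, 2}}

# Each entry: (sender, (q_a, n_a), (q_b, n_b)). Server q runs Reduce function q,
# so the value v(q, n) is wanted by server q.
multicasts = [
    (1, (2, 1), (3, 3)),
    (2, (3, 4), (1, 5)),
    (3, (1, 6), (2, 2)),
]

recovered = {1: {}, 2: {}, 3: {}}
for sender, a, b in multicasts:
    # The sender mapped both files, so it can form the coded symbol.
    assert a[1] in files_at[sender] and b[1] in files_at[sender]
    coded = value(*a) ^ value(*b)
    for wanted, side in ((a, b), (b, a)):
        receiver = wanted[0]
        # The receiver mapped the other file, so it can cancel that term.
        assert side[1] in files_at[receiver]
        recovered[receiver][wanted] = coded ^ value(*side)

# Every server now holds the two values from files it did not map.
for server, got in recovered.items():
    for (q, n), v in got.items():
        assert q == server and n not in files_at[server] and v == value(q, n)

print(len(multicasts), "coded transmissions instead of 6 uncoded ones")
```

In this toy case, three coded broadcasts deliver six missing values, halving the shuffle traffic relative to sending each value separately.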

Coded Merkle Tree: Solving Data Availability Attacks in Blockchains

Sahraei

Li³

et al. 2020

120

A Fundamental Tradeoff Between Computation and Communication in Distributed Computing

Maddah-Ali

et al. 2018

IEEE Trans. Inform. Theory

376
