A New Combinatorial Design of Coded Distributed Computing

Woolsey, Nicholas; Chen, Rongrong; Ji, Mingyue

doi:10.1109/isit.2018.8437323

Cited by 35 publications

(61 citation statements)

References 40 publications


“…Define $P_k \triangleq \frac{m_k - l_k}{1 - l_k}$ as the surplus computation ratio of node $k$. In this example, we have $[P_1, P_2, P_3, P_4] = [0, \frac{1}{11}, \frac{1}{11}, \frac{7}{22}]$. In the second step, we further partition each batch $N_k$, for $k \in [4]$,…”

Section: Map Phase Design (mentioning)

confidence: 99%
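The surplus computation ratio quoted above can be checked with a few lines of exact arithmetic. This is only a sketch: it assumes the garbled expression reads $P_k = (m_k - l_k)/(1 - l_k)$, that $m_k$ is the fraction of files node $k$ maps, and that $l_k$ is the fraction in its compulsory batch. The function name `surplus_ratio` and the input values are hypothetical and are not taken from the cited paper.

```python
from fractions import Fraction


def surplus_ratio(m_k: Fraction, l_k: Fraction) -> Fraction:
    """Assumed form of the surplus computation ratio: (m_k - l_k) / (1 - l_k).

    m_k: fraction of all files that node k maps (assumed meaning).
    l_k: fraction of all files in node k's compulsory batch (assumed meaning).
    """
    if not (0 <= l_k <= m_k <= 1) or l_k == 1:
        raise ValueError("expected 0 <= l_k <= m_k <= 1 and l_k < 1")
    return (m_k - l_k) / (1 - l_k)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
no_surplus = surplus_ratio(Fraction(1, 4), Fraction(1, 4))
some_surplus = surplus_ratio(Fraction(1, 2), Fraction(1, 4))
print(no_surplus, some_surplus)
```

Under this reading, a node whose capability is fully used by its compulsory batch has ratio 0, which is consistent with $P_1 = 0$ in the quoted example.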

“…2) Shuffle phase design: After the Map phase, each node $k$, for $k \in [4]$, needs the IVs of the other $(1 - m_k)N$ files to compute the Reduce functions of its assigned functions $W_k$. Meanwhile, it should distribute the IVs computed from its compulsory files $N_k$ to the requiring nodes.…”

Section: Map Phase Design (mentioning)

confidence: 99%

“…It leverages the redundant computation capabilities at nodes by carefully designing input file allocation in the Map phase so as to exploit coded multicasting opportunities and hence reduce communication load in the Shuffle phase. The optimal tradeoff between the computation load in the Map phase and the communication load in the Shuffle phase is derived in [3], which finds that increasing computation load of the Map phase by r can reduce communication load of the Shuffle phase by the same factor r. This idea of coded distributed computing has since been extended widely, e.g., [4]-[9]. In particular, [4], [5] propose new coded distributed computing schemes, [6] studies distributed computing with storage constraints at nodes, [7] studies distributed computing under time-varying excess computing resources, and [8], [9] study wireless distributed computing systems.…”

Section: Introduction (mentioning)

confidence: 99%
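The factor-of-r statement in the quotation above can be made concrete with the standard homogeneous expressions for K nodes: an uncoded Shuffle load of 1 - r/K and a coded load of (1/r)(1 - r/K). The sketch below assumes those closed forms, which the quotation itself does not spell out. The function names are illustrative.

```python
from fractions import Fraction


def uncoded_load(r: int, K: int) -> Fraction:
    """Normalized Shuffle load without coding: each node misses a 1 - r/K share."""
    return 1 - Fraction(r, K)


def coded_load(r: int, K: int) -> Fraction:
    """Normalized Shuffle load with coded multicasting: uncoded load divided by r."""
    return uncoded_load(r, K) / r


K = 10  # number of nodes, chosen only for illustration
for r in range(1, K + 1):
    # r is the computation load: how many nodes map each file.
    print(r, uncoded_load(r, K), coded_load(r, K))
```

For every r the ratio between the two loads is exactly r, which is the gain the quotation attributes to [3].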

“…We aim to compare $L^i_{A,2}$, for $i \in [4]$, to $L_{\text{Lower}}$ in (32) one by one, so as to obtain the multiplicative gap $\frac{L_{A,2}}{L_{\text{Lower}}}$. Lemma 2. When $r \le K - 2$, the multiplicative gap between $L^1_{A,2}$ and $L_{\text{Lower}}$ in (32) is within 20e, i.e.,…”

mentioning

confidence: 99%


Heterogeneous Coded Distributed Computing: Joint Design of File Allocation and Function Assignment

Tao

2019

2019 IEEE Global Communications Conference (GLOBECOM)


This paper studies the computation-communication tradeoff in a heterogeneous MapReduce computing system where each distributed node is equipped with different computation capability. We first obtain an achievable communication load for any given computation load and any given function assignment at each node. The proposed file allocation strategy has two steps: first, the input files are partitioned into disjoint batches, each with possibly different size and computed by a distinct node; then, each node computes additional files from its non-computed files according to its redundant computation capability. In the Shuffle phase, coded multicasting opportunities are exploited thanks to the repetitive file allocation among different nodes. Based on this scheme, we further propose the computation-aware and the shuffle-aware function assignments. We prove that, by using proper function assignments, our achievable communication load for any given computation load is within a constant multiplicative gap to the optimum in an equivalent homogeneous system with the same average computation load. Numerical results show that our scheme with shuffle-aware function assignment achieves better computation-communication tradeoff than existing works in some cases.

…coded multicasting opportunities are created as many as possible in the Shuffle phase to obtain the optimal computation-communication tradeoff. However, they only consider heterogeneous file allocation in the Map phase due to different storage size, and still assume homogeneous function assignment in the Reduce phase without taking the different computation capabilities across nodes into account. The authors in [13], [14] consider the heterogeneous systems where each node is assigned different number of output functions. Both works obtain an achievable communication load which is within a constant multiplicative gap to the optimum given the considered function assignment. They find that, by assigning more output functions to nodes with more input files, their proposed schemes even outperform the optimal scheme in an equivalent homogeneous system [3] in some cases. However, the heterogeneous systems considered in [13], [14] consist of multiple homogeneous systems where nodes in each system have the same storage and computation capabilities but differ from nodes in other systems, and is thus not suitable to…


“…For r = 8, we obtain a speedup of 1128.16/26.22 ≈ 43.03. This is explained by the fact that in the previous scheme each server participates into much more groups and thus it needs to store more encoded data into its memory.…”

Section: Terasort Experimental Results and Discussion (mentioning)

confidence: 99%

Leveraging Coding Techniques for Speeding up Distributed Computing

Konstantinidis; Ramamoorthy

2018

2018 IEEE Global Communications Conference (GLOBECOM)


Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate values to complete the computation. Experimental evidence suggests that this so-called Shuffle phase can be a significant part of the overall execution time for several classes of jobs. Prior work has demonstrated a natural tradeoff between computation and communication whereby running redundant copies of jobs can reduce the Shuffle traffic load, thereby leading to reduced overall execution times. For a single job, the main drawback of this approach is that it requires the original job to be split into a number of files that grows exponentially in the system parameters. When extended to multiple jobs (with specific function types), these techniques suffer from a limitation of a similar flavor, i.e., they require an exponentially large number of jobs to be executed. In practical scenarios, these requirements can significantly reduce the promised gains of the method. In this work, we show that a class of combinatorial structures called resolvable designs can be used to develop efficient coded distributed computing schemes for both the single and multiple job scenarios considered in prior work. We present both theoretical analysis and exhaustive experimental results (on Amazon EC2 clusters) that demonstrate the performance advantages of our method. For the single and multiple job cases, we obtain speed-ups of 4.69x (and 2.6x over prior work) and 4.31x over the baseline approach, respectively.
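The exponential file requirement mentioned in this abstract can be illustrated numerically. The sketch below rests on an assumption that the abstract does not state: in the original coded MapReduce scheme with K servers and computation load r, the job is split into a multiple of C(K, r) files, one batch per r-subset of servers. The function name `min_files` is hypothetical.

```python
from math import comb


def min_files(K: int, r: int) -> int:
    """Smallest file count under the assumed C(K, r) subpacketization."""
    return comb(K, r)


# Fix the computation load at r = K // 2 and let the cluster grow.
for K in (10, 20, 50):
    r = K // 2
    print(f"K={K:>2}, r={r:>2}: at least {min_files(K, r)} files")
```

Even for a 20-server cluster the count is already in the hundreds of thousands, which matches the abstract's concern that this requirement erodes the promised gains in practice.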


Cascaded Coded Distributed Computing Schemes Based on Placement Delivery Arrays

Jiang

2020

IEEE Access


Li et al. introduced coded distributed computing (CDC) scheme to reduce the communication load in general distributed computing frameworks such as MapReduce. They also proposed cascaded CDC schemes where each output function is computed multiple times, and proved that such schemes achieved the fundamental trade-off between computation load and communication load. However, these schemes require exponentially large numbers of input files and output functions when the number of computing nodes gets large. In this paper, by using the structure of placement delivery arrays (PDAs), we construct several infinite classes of cascaded CDC schemes. We also show that the numbers of output functions in all the new schemes are only a factor of the number of computing nodes, and the number of input files in our new schemes is much smaller than that of input files in CDC schemes derived by Li et al.

