2019 IEEE International Symposium on Information Theory (ISIT)
DOI: 10.1109/isit.2019.8849317

Coded Matrix Multiplication on a Group-Based Model

Abstract: Coded distributed computing has been considered a promising technique for making large-scale systems robust to "straggler" workers. Yet, practical system models for distributed computing that reflect the clustered or grouped structure of real-world computing servers have not been available. Nor have the large variations in computing power and bandwidth capability across different servers been properly modeled. We suggest a group-based model to reflect practical conditions and develop an appro…

Cited by 29 publications (21 citation statements)
References 20 publications
“…While most CDC schemes consider homogeneous computing nodes, a few recent studies have investigated CDC over heterogeneous computing clusters. In particular, Kim et al. [32], [33] considered the matrix-vector multiplication problem and presented an optimal load-allocation method that achieves a lower bound on the expected latency. Reisizadeh et al. [21] introduced a different approach, Heterogeneous Coded Matrix Multiplication (HCMM), which maximizes the expected amount of computation aggregated at the master node.…”
Section: Related Workmentioning
confidence: 99%
“…Since the following equality holds, the PS can obtain the full gradient by receiving the computation results from all the workers. In contrast to this naive approach, coded computation schemes for distributed matrix multiplication [22, 23, 32, 34] first encode the submatrices and then assign them to the workers to achieve a certain tolerance against slow/straggling workers.…”
Section: An Overview Of Existing Straggler Avoidance Techniquesmentioning
confidence: 99%
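To make the encode-then-assign idea in the excerpt above concrete, here is a minimal sketch of (n, k) MDS-coded matrix-vector multiplication: the master splits the matrix into k row blocks, sends one coded block to each of n workers, and recovers the full product from the first k responses. The worker counts, the Vandermonde encoder, and the simulated straggler set are illustrative assumptions for this sketch, not the construction used in any of the cited papers.

```python
# Sketch of (n, k) MDS-coded matrix-vector multiplication with straggler tolerance.
# Assumption: a Vandermonde generator over distinct real evaluation points, so any
# k of the n coded results suffice to decode.
import numpy as np

n, k = 5, 3                       # n workers; any k responses are enough
rows, cols = 6, 4                 # rows must be divisible by k
rng = np.random.default_rng(0)
A = rng.standard_normal((rows, cols))
x = rng.standard_normal(cols)

# Split A into k row blocks and encode them into n coded blocks.
blocks = np.split(A, k)                                    # each block: (rows//k, cols)
G = np.vander(np.arange(1, n + 1), k, increasing=True)     # (n, k) Vandermonde encoder
coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

# Each worker i computes coded[i] @ x; suppose workers 1 and 4 straggle.
finished = [0, 2, 3]                                       # indices of the first k responders
results = [coded[i] @ x for i in finished]

# Decode: invert the k x k submatrix of G formed by the responders' rows.
G_sub = G[finished, :]
decoded_blocks = np.linalg.solve(G_sub, np.stack(results)) # rows are blocks[j] @ x
y_hat = decoded_blocks.reshape(-1)

assert np.allclose(y_hat, A @ x)                           # full product recovered
```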
“…A wealth of straggler avoidance techniques have been proposed in recent years for DGD as well as other distributed computation tasks [5]–[48]. The common design notion behind all these schemes is the assignment of redundant computations/tasks to workers, such that faster workers can compensate for the stragglers.…”
Section: Introductionmentioning
confidence: 99%
“…In practical distributed computing systems, some processing nodes have the same computational capabilities, in the sense of sharing the same distribution of computation time, and thus they can be grouped together. By exploiting the group structure and the heterogeneity among different groups of processing nodes [141], [142], combining group codes with an optimal load-allocation strategy not only approaches the optimal computation time achieved by MDS codes but also has low decoding complexity. In addition, by varying the number of matrix rows allocated to the workers [142], the computation latency can be reduced by orders of magnitude, as the number of workers increases, compared to MDS codes with a fixed computation-load allocation [141].…”
Section: A Computation Load Allocationmentioning
confidence: 99%
“…By exploiting the group structure and the heterogeneity among different groups of processing nodes [141], [142], combining group codes with an optimal load-allocation strategy not only approaches the optimal computation time achieved by MDS codes but also has low decoding complexity. In addition, by varying the number of matrix rows allocated to the workers [142], the computation latency can be reduced by orders of magnitude, as the number of workers increases, compared to MDS codes with a fixed computation-load allocation [141]. The load-allocation strategy proposed in [142] focuses mainly on the design of an optimal MDS code.…”
Section: A Computation Load Allocationmentioning
confidence: 99%
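The two excerpts above describe allocating different numbers of coded rows to workers depending on their group. The following is a minimal sketch of such a group-aware allocation, assuming rows are assigned in proportion to each group's aggregate speed and inflated by a redundancy factor; the function name, the proportional rule, and the redundancy parameter are illustrative assumptions, not the allocation rule of [141] or [142].

```python
# Sketch of group-based computation-load allocation: workers in the same group
# share a computation-time distribution, so each group receives coded rows in
# proportion to its aggregate speed, with extra redundancy to tolerate stragglers.
from typing import Dict, List

def allocate_rows(total_rows: int,
                  groups: Dict[str, Dict[str, float]],
                  redundancy: float = 1.25) -> Dict[str, List[int]]:
    """Return, per group, the number of coded rows given to each of its workers.

    groups maps a group name to {"workers": count, "speed": rows per unit time}.
    redundancy (> 1) controls how much extra coded work is spread across the
    cluster so that slow workers can be ignored.
    """
    coded_rows = int(total_rows * redundancy)
    # Group capacity = number of workers times per-worker speed.
    capacity = {g: cfg["workers"] * cfg["speed"] for g, cfg in groups.items()}
    total_capacity = sum(capacity.values())

    allocation: Dict[str, List[int]] = {}
    for g, cfg in groups.items():
        group_share = coded_rows * capacity[g] / total_capacity
        per_worker = int(round(group_share / cfg["workers"]))
        allocation[g] = [per_worker] * int(cfg["workers"])
    return allocation

# Example: a fast group of 4 workers and a slow group of 8 workers.
print(allocate_rows(12000, {
    "fast": {"workers": 4, "speed": 100.0},
    "slow": {"workers": 8, "speed": 25.0},
}))
# -> fast workers each get 2500 coded rows, slow workers each get 625
```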