2013 IEEE 27th International Symposium on Parallel and Distributed Processing
DOI: 10.1109/ipdps.2013.80
Communication-Optimal Parallel Recursive Rectangular Matrix Multiplication

Abstract: Communication-optimal algorithms are known for square matrix multiplication. Here, we obtain the first communication-optimal algorithm for all dimensions of rectangular matrices. Combining the dimension-splitting technique of Frigo, Leiserson, Prokop and Ramachandran (1999) with the recursive BFS/DFS approach of Ballard, Demmel, Holtz, Lipshitz and Schwartz (2012) allows for a communication-optimal as well as cache- and network-oblivious algorithm. Moreover, the implementation is simple: approximately …
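The dimension-splitting idea the abstract refers to can be illustrated with a minimal sequential sketch, assuming the Frigo et al. style of recursion: at each step, halve whichever of the three dimensions m, k, n is largest and recurse. The function name, base-case threshold, and use of NumPy are illustrative, not the authors' implementation.

```python
import numpy as np

def recursive_matmul(A, B, base=64):
    """Multiply A (m x k) by B (k x n) by recursively halving the
    largest of the three dimensions (cache-oblivious style sketch)."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    if max(m, k, n) <= base:
        return A @ B  # base case: small enough to multiply directly
    C = np.zeros((m, n))
    if m >= k and m >= n:            # split the rows of A
        h = m // 2
        C[:h] = recursive_matmul(A[:h], B, base)
        C[h:] = recursive_matmul(A[h:], B, base)
    elif n >= k:                     # split the columns of B
        h = n // 2
        C[:, :h] = recursive_matmul(A, B[:, :h], base)
        C[:, h:] = recursive_matmul(A, B[:, h:], base)
    else:                            # split the shared dimension k
        h = k // 2
        C = recursive_matmul(A[:, :h], B[:h], base) \
            + recursive_matmul(A[:, h:], B[h:], base)
    return C
```

In the parallel BFS/DFS version described by the paper, each such split is either executed by disjoint processor subsets (BFS) or sequentially by all processors (DFS), which is what makes the algorithm network-oblivious; the sketch above shows only the splitting rule.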

Cited by 100 publications (103 citation statements); References 29 publications
“…We also present a new 3D recursive algorithm which is a parallelization of a sequential recursive algorithm using the techniques of [3,13]. Although we have assumed that the input matrices are square, the recursive algorithm will use rectangular matrices for subproblems.…”
Section: 3D Recursive Algorithm
Confidence: 99%
“…We obtain two new communication-optimal algorithms. Our 3D iterative and recursive algorithms (see Sections 4.3 and 4.4) are adaptations of dense ones [13,25], though an important distinction is that the sparse algorithms do not require extra local memory to minimize communication. We also optimize an existing algorithm, Sparse SUMMA, to be communication-optimal in some cases.…”
Section: Introduction
Confidence: 99%
“…In a memory-constrained case (2D), each processor needs to hold an n^2/√p portion of the data. In the communication-optimal case, each processor should have enough memory to hold a larger partition of the input matrices (A and B), estimated at n^2/p^(2/3) words [26]. Having such portions of data could involve at most n^2/p^(2/3)…”
Section: Analytical Estimate of the Data Movement of the Proxy Benchmark
Confidence: 99%
“…In 2012, Ballard [10] proposed a communication-optimal Strassen's algorithm which performs better than any previous classical or Strassen-based parallel algorithm for square matrix multiplication. Later, Demmel [39] proposed a communication-optimal recursive algorithm for rectangular matrix multiplication, applying communication-optimal theory to all matrix dimensions.…”
Section: Related Work
Confidence: 99%