[1993] Proceedings Seventh International Parallel Processing Symposium
DOI: 10.1109/ipps.1993.262814
A tensor product formulation of Strassen's matrix multiplication algorithm with memory reduction

Abstract: In this article, we present a program generation strategy of Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier Transforms and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this a…
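As background for the block-recursive structure that such tensor product formulas encode, the sketch below shows Strassen's seven-multiplication recursion in Python/NumPy. This is an illustrative version only, not the paper's tensor-product-generated code; the function name strassen, the leaf cutoff, and the power-of-two size assumption are ours.

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Strassen's recursive matrix multiplication (illustrative sketch).

    Assumes square matrices whose order is a power of two times `leaf`;
    below the cutoff it falls back to the ordinary product.
    """
    n = A.shape[0]
    if n <= leaf:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The seven recursive block products.
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)

    # Reassemble the four blocks of the result.
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4,           M1 - M2 + M3 + M6]])
```

For example, strassen(A, B) on two random 256×256 arrays agrees with A @ B up to floating-point round-off.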

Cited by 22 publications (10 citation statements)
References: 9 publications
“…In particular, numerical linear algebra based on Strassen's algorithm (if numerical stability issues have been considered acceptable) should clearly benefit from most of its results. Related work on the parallelization of the sub-cubic numerical linear algebra include [1,24,6,25,2].…”
Section: Introduction
Citation type: mentioning
Confidence: 99%
“…There are several sequential implementations of Strassen's fast matrix multiplication algorithm [2,11,17], and parallel versions have been implemented for both shared-memory [9,25] and distributed-memory architectures [3,13]. For our parallel algorithms in Section 4, we use the ideas of breadth-first and depth-first traversals of the recursion trees, which were first considered by Kumar et al. [25] and Ballard et al. [3] for minimizing memory footprint and communication.…”
Section: Related Work
Citation type: mentioning
Confidence: 99%
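The breadth-first versus depth-first trade-off referred to in this statement can be sketched as follows; this is a hypothetical Python illustration, not code from [25] or [3], and the function name schedule and the tuple task labels are invented. Depth-first order keeps only one root-to-leaf chain of temporaries live, while breadth-first order materializes all 7^d tasks of a level, trading memory for exploitable parallelism.

```python
from collections import deque

def schedule(max_depth, order="dfs"):
    """List Strassen recursion-tree tasks (as index tuples) in DFS or BFS order.

    Each internal node spawns the 7 block products M1..M7. BFS expands a whole
    level (7**d tasks) before descending, which exposes parallelism but needs
    storage for every sibling; DFS finishes one subtree before starting the
    next, keeping only a single path of temporaries live.
    """
    frontier = deque([()])          # start from the root task
    visited = []
    while frontier:
        node = frontier.popleft() if order == "bfs" else frontier.pop()
        visited.append(node)
        if len(node) < max_depth:
            children = [node + (m,) for m in range(1, 8)]
            # reverse for DFS so children are still visited in M1..M7 order
            frontier.extend(children if order == "bfs" else reversed(children))
    return visited

# e.g. schedule(1, "bfs") -> [(), (1,), (2,), ..., (7,)]
```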
“…The resulting algorithm has sequential complexity Θ(n^ω), where ω = log_N R. A BSP version of the algorithm was proposed in [12] (see also [9], [5], and [6]). The recursion tree is computed in breadth-first order.…”
Section: Fast Matrix Multiplication
Citation type: mentioning
Confidence: 99%
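For concreteness (our arithmetic, not part of the quoted statement): Strassen's base case multiplies N = 2 blocks per dimension using R = 7 block multiplications, so ω = log_2 7 ≈ 2.807, compared with ω = log_2 8 = 3 for the classical algorithm.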