1995
DOI: 10.1155/1995/636457

A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction

Abstract: In this article, we present a program generation strategy for Strassen's matrix multiplication algorithm using a programming methodology based on tensor product formulas. In this methodology, block recursive programs such as the fast Fourier transform and Strassen's matrix multiplication algorithm are expressed as algebraic formulas involving tensor products and other matrix operations. Such formulas can be systematically translated to high-performance parallel/vector codes for various architectures. In this a…
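The block-recursive structure the abstract refers to can be illustrated with a minimal sketch. The abstract is truncated here, so the article's own formulas are not visible; the coefficient matrices `U`, `V`, `W` below are the standard textbook form of Strassen's seven products, and the helper names (`split`, `combine`, `strassen`) are hypothetical. The sketch assumes square inputs whose size is a power of two. Applying `U`, `V`, and `W` blockwise is the operation that a tensor product formulation would write as (U ⊗ I), (V ⊗ I), and (W ⊗ I) acting on the stacked quadrants.

```python
# Coefficient matrices of Strassen's bilinear form (standard textbook version,
# not taken from the article). Quadrants are ordered 11, 12, 21, 22.
U = [  # combinations of the quadrants of A feeding the 7 products
    [1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
    [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1],
]
V = [  # combinations of the quadrants of B feeding the 7 products
    [1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
    [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1],
]
W = [  # combinations of the 7 products giving the quadrants of C
    [1, 0, 0, 1, -1, 0, 1],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [1, -1, 1, 0, 0, 1, 0],
]


def split(x):
    """Return the four quadrants of x in the order 11, 12, 21, 22."""
    h = len(x) // 2
    return [
        [row[:h] for row in x[:h]],
        [row[h:] for row in x[:h]],
        [row[:h] for row in x[h:]],
        [row[h:] for row in x[h:]],
    ]


def combine(coeffs, blocks):
    """Linear combination sum_k coeffs[k] * blocks[k] of equally sized blocks."""
    h = len(blocks[0])
    return [
        [sum(c * blk[i][j] for c, blk in zip(coeffs, blocks)) for j in range(h)]
        for i in range(h)
    ]


def strassen(a, b):
    """Multiply square matrices (lists of lists) whose size is a power of two."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    qa, qb = split(a), split(b)
    s = [combine(row, qa) for row in U]             # blockwise (U (x) I)
    t = [combine(row, qb) for row in V]             # blockwise (V (x) I)
    m = [strassen(si, ti) for si, ti in zip(s, t)]  # 7 recursive products
    c11, c12, c21, c22 = (combine(row, m) for row in W)  # blockwise (W (x) I)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

This version evaluates all seven operand pairs before recursing, which is simple but holds every temporary at once; computing one product at a time would trade parallelism for a smaller footprint, the tradeoff the citing papers below describe.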

Cited by 16 publications (6 citation statements)
References 9 publications
“…Strassen's fast matrix multiplication algorithm has been implemented for both shared-memory [8,21] and distributed-memory architectures [2,11,24]. For our parallel algorithms in Section 4, we use the ideas of breadth-first and depth-first traversals of the recursion trees, which were first considered by Kumar et al [21] and Ballard et al [2] for minimizing memory footprint and communication.…”
Section: Related Work (mentioning)
confidence: 99%
“…Kumar, Huang, Johnson, and Sadayappan [24] implemented Strassen's algorithm on a shared-memory machine. They identified the tradeoff between available parallelism and total memory footprint by differentiating between "partial" and "complete" evaluation of the algorithm, which corresponds to what we call depth-first and breadth-first traversal of the recursion tree (see Section 3.1).…”
Section: Previous Work On Parallel Strassen (mentioning)
confidence: 99%
“…An extensive literature exists on parallelizing naive matrix multiplication algorithms [9], [10], [11], [12], [13], [6], [14], [15], [16], [17], [5], [18], [19], [20] and [21]. Similarly Strassen's matrix multiplication algorithm has also been extensively studied for parallelization [22], [23], [24], [25], [16], [26], [27], [28], [29], [30] and [31]. As pointed out in [6], [20] and [30], the literature on parallel and distributed matrix multiplication can be divided broadly into three categories: 1) Grid based approach, 2) BFS/DFS based approach and 3) Hadoop and Spark based approach.…”
Section: Related Work (mentioning)
confidence: 99%