2015
DOI: 10.1145/2858788.2688513

A framework for practical parallel fast matrix multiplication

Abstract: Matrix multiplication is a fundamental computation in many scientific disciplines. In this paper, we show that novel fast matrix multiplication algorithms can significantly outperform vendor implementations of the classical algorithm and Strassen's fast algorithm on modest problem sizes and shapes. Furthermore, we show that the best choice of fast algorithm depends not only on the size of the matrices but also on their shape. We develop a code generation tool to automatically implement multiple sequential and shared-memory parallel…
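
For context, the "fast algorithms" in question are Strassen-like methods that trade extra block additions for fewer block multiplications. The following sketch (a plain NumPy illustration, not the paper's code generation tool) shows a single level of Strassen's recursion, which forms C = A·B from 7 block products instead of the classical 8; it assumes square matrices of even dimension.

import numpy as np

def strassen_step(A, B):
    # One level of Strassen's recursion: 7 block multiplications instead of 8.
    # Illustrative only; assumes square matrices with even dimension.
    n = A.shape[0]
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The seven Strassen products.
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    # Recombine the products into the four blocks of C = A @ B.
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

# Quick sanity check against the classical product.
A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
assert np.allclose(strassen_step(A, B), A @ B)

Applied recursively, this uses O(n^2.81) arithmetic instead of O(n^3); the paper's point is that many other fast algorithms of this kind exist and that their relative performance depends on both matrix size and shape.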

Cited by 50 publications (82 citation statements)
References 41 publications

“…We also use heuristics in order to encourage sparsity in the solutions, and a fortunate by-product of the sparsity is that the nonzero values often tend towards a discrete set of values from which an exact decomposition can be recognized. Our methods are based on techniques that have proved successful in discovering generic exact decompositions (those that have no noticeable symmetries) [1,22]; we summarize this approach in Section 3.1. Our search process can be divided into two phases.…”
Section: Discussion on the Numerical Methods Used
confidence: 99%
“…We discuss effective choices for the regularization parameters in Section 3.3. This method works for the cases of non-square matrix multiplication and has been used to discover exact rank decompositions for many small cases [1,22], but it does not encourage solutions to reflect any symmetries.…”
Section: Discussion on the Numerical Methods Used
confidence: 99%
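
The two statements above refer to a numerical search for low-rank decompositions of the matrix multiplication tensor, where a rank-R decomposition yields a fast algorithm using R multiplications. As a toy illustration of that idea only (it does not reproduce the cited authors' method, regularization schedule, or sparsity heuristics), the sketch below runs ridge-regularized alternating least squares on the 2x2x2 matrix multiplication tensor, whose rank of 7 corresponds to Strassen's algorithm; the rank, regularization weight, and iteration count are arbitrary, and whether the residual reaches zero depends on the random start.

import numpy as np

def matmul_tensor(n):
    # The <n,n,n> matrix multiplication tensor: T[(i,j),(k,l),(m,p)] = 1
    # exactly when j == k, m == i, and p == l.
    T = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for l in range(n):
                T[i * n + j, j * n + l, i * n + l] = 1.0
    return T

def khatri_rao(X, Y):
    # Column-wise Kronecker product of X and Y.
    return np.vstack([np.kron(X[:, r], Y[:, r]) for r in range(X.shape[1])]).T

def als_search(n=2, rank=7, lam=1e-3, iters=2000, seed=0):
    # Ridge-regularized alternating least squares for a rank-`rank` CP
    # decomposition T[i,j,k] ~= sum_r U[i,r] V[j,r] W[k,r].
    rng = np.random.default_rng(seed)
    T = matmul_tensor(n)
    N = n * n
    T1 = T.reshape(N, N * N)                     # mode-1 unfolding
    T2 = T.transpose(1, 0, 2).reshape(N, N * N)  # mode-2 unfolding
    T3 = T.transpose(2, 0, 1).reshape(N, N * N)  # mode-3 unfolding
    U, V, W = (rng.standard_normal((N, rank)) for _ in range(3))
    reg = lam * np.eye(rank)
    for _ in range(iters):
        # Each factor update is a ridge-regularized least-squares solve.
        M = khatri_rao(V, W)
        U = np.linalg.solve(M.T @ M + reg, M.T @ T1.T).T
        M = khatri_rao(U, W)
        V = np.linalg.solve(M.T @ M + reg, M.T @ T2.T).T
        M = khatri_rao(U, V)
        W = np.linalg.solve(M.T @ M + reg, M.T @ T3.T).T
    residual = np.linalg.norm(T - np.einsum('ir,jr,kr->ijk', U, V, W))
    return U, V, W, residual

The quoted work additionally encourages sparsity so that near-discrete numerical solutions can be rounded and verified as exact decompositions; none of that is attempted here.
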
“…Ballard, Demmel, Holtz, Lipshitz, and Schwartz [15], Ballard, Demmel, Holtz, and Schwartz [16], and Lipshitz, Ballard, Demmel, and Schwartz [77]. Recent engineering work includes Benson and Ballard [23] and Huang, Rice, Matthews, and van de Geijn [65]. Our work differs from these works in that we seek a self-contained proof-of-concept demonstration requiring good-performance finite-field matrix multiplication on GPUs, but we do not necessarily seek the most optimized possible implementation; such optimizations are left for future work.…”
Section: Fast Matrix Multiplication
confidence: 99%
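
The citing work above needs matrix multiplication over a finite field rather than over the reals. Purely as a point of reference (it is not the GPU implementation the authors describe), the sketch below multiplies matrices over F_p in NumPy using 64-bit integer accumulation and a single reduction mod p per output entry.

import numpy as np

def matmul_mod_p(A, B, p=2):
    # Matrix product over the finite field F_p.
    # Delayed reduction: the int64 dot products cannot overflow as long as
    # the inner dimension times (p - 1)**2 stays below 2**63.
    A = np.asarray(A, dtype=np.int64) % p
    B = np.asarray(B, dtype=np.int64) % p
    return (A @ B) % p

# Over F_2, addition is XOR and multiplication is AND.
A = np.array([[1, 0], [1, 1]])
B = np.array([[1, 1], [0, 1]])
print(matmul_mod_p(A, B))  # [[1 1]
                           #  [1 0]]
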
“…When q = 2, we use F2 and {0, 1} interchangeably throughout the paper. In fact, error correcting codes, as well as constructing new codes out of existing codes by concatenations to be discussed shortly, can be defined more generally over an arbitrary set of q distinct elements called the alphabet of the code.…”
confidence: 99%