2015 44th International Conference on Parallel Processing
DOI: 10.1109/icpp.2015.106

Generating Efficient Tensor Contractions for GPUs

Cited by 33 publications (24 citation statements)
References 22 publications
“…[Einstein 1916] "We therefore introduce the following rule: when an index appears twice in a term of an expression, one shall always sum over it, unless the opposite is noted explicitly." (authors' translation) In addition to the importance of Einstein's convention for mathematical notation, it serves as the basis for elegant domain-specific languages [Åhlander 2002; Nelson et al. 2015; Solomonik et al. 2013]. For example, in the Cyclops tensor framework one may write [Solomonik et al. 2013]:…”
Section: High-level Language and Representation
confidence: 99%
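As a rough illustration of what such an Einstein-notation expression denotes (a minimal plain-C++ sketch of the implied loop nest, not the actual Cyclops API; the tensor names, sizes, and row-major layout are assumptions made for this example), a statement like C["ij"] = A["ik"] * B["kj"] expands to a contraction that sums over the repeated index k:

#include <vector>

// Sketch of the loop nest denoted by C["ij"] = A["ik"] * B["kj"]:
// the index k appears twice in the term, so it is summed over.
// Sizes I, J, K and the flat row-major layout are assumptions for
// this illustration only.
void contract_ij_ik_kj(const std::vector<double>& A,
                       const std::vector<double>& B,
                       std::vector<double>& C,
                       int I, int J, int K) {
    for (int i = 0; i < I; ++i) {
        for (int j = 0; j < J; ++j) {
            double sum = 0.0;
            for (int k = 0; k < K; ++k) {      // repeated index k -> summed
                sum += A[i * K + k] * B[k * J + j];
            }
            C[i * J + j] = sum;
        }
    }
}

The appeal of such DSLs is that the user states only the index expression; loop ordering, tiling, and mapping to the hardware are left to the framework or code generator.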
“…Existing open-source software packages focus on binary tensor contractions [Li et al. 2015; Matthews 2018; Shi et al. 2016; Solomonik et al. 2013; Springer and Bientinesi 2018], GPUs [Nelson et al. 2015], only support tensors up to order 2 (matrices) [Spampinato et al. 2018; Spampinato and Püschel 2014; Uphoff and Bader 2016], or focus on loop transformations [Kempf et al. 2018; Luporini et al. 2015; Stock et al. 2011], where the latter lack support for sparse matrices in element-local operators and are, to our understanding, not designed for use with code generators for small GEMMs.…”
Section: Introduction
confidence: 99%
“…Recently, GPUs have been increasingly adopted to accelerate diverse tensor computations. Some works have focused on accelerating specific tensor operations, including tensor contraction [25, 26], factorization [27], transpose [28, 29], and tensor-matrix multiplication [30]. These works propose parallel tensor algorithms specifically optimized for GPU architectures.…”
Section: Related Work
confidence: 99%
“…However, they focus on optimizing only a limited number of tensor contraction kernels on extremely small tensors. Other works [1] [20] improve tensor computation performance through loop reorganization and fusion.…”
Section: Introduction and Scope
confidence: 99%
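To make the loop reorganization and fusion idea concrete (a hedged plain-C++ sketch under assumed shapes and operations, not the implementation from the cited works), consider a contraction whose result is immediately scaled elementwise; fusing the two loop nests writes the output tensor exactly once instead of making a second full pass over it:

#include <vector>

// Hedged sketch of loop fusion (not code from the cited works):
// a contraction followed by an elementwise scaling of the result.
//
// Unfused, this takes two passes over C:
//   pass 1: C[i*J + j]  = sum over k of A[i*K + k] * B[k*J + j]
//   pass 2: C[i*J + j] *= alpha
//
// Fused, the scaling folds into the contraction loop:
void contract_then_scale_fused(const std::vector<double>& A,
                               const std::vector<double>& B,
                               std::vector<double>& C,
                               int I, int J, int K, double alpha) {
    for (int i = 0; i < I; ++i) {
        for (int j = 0; j < J; ++j) {
            double sum = 0.0;
            for (int k = 0; k < K; ++k) {
                sum += A[i * K + k] * B[k * J + j];
            }
            C[i * J + j] = alpha * sum;    // fused elementwise scaling
        }
    }
}

On GPUs the same idea reduces global-memory traffic, since the intermediate result never has to be written out and re-read between the two loop nests.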