2021
DOI: 10.1007/978-3-030-85665-6_22

Algorithm Design for Tensor Units

Cited by 3 publications (3 citation statements)
References 15 publications
“…However, summing up T_a[1][1] and T_b[1][1] will lead to incorrect result. This is because the first column of P_b are the coefficients required by the reduction of the second row in [5] is excepted), other than the first row. Thus, the matrix A′ cannot be multiplied directly by parameter matrix as in star stencil.…”
Section: Adapting Box Stencil On TCU (mentioning)
confidence: 99%
“…As shown in Figure 9, after each GEMM operation, we move A′ one row down along the input mesh. For example, to update point M[5][3], the first GEMM operation starts at the first row of the input mesh, where T_a[3] Similar to star stencil (Figure 4 and Figure 5), all points of the inner region with size of L + 2r × L can be calculated by 2r + 1 GEMMs operations on TCU for box stencil, as shown in Figure 9, and the points of the boundary region can be updated with partial weighted reductions through GEMMs operations (the other partial weighted reductions are calculated using FMA on CUDA cores).…”
Section: Adapting Box Stencil On TCU (mentioning)
confidence: 99%
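
The idea in the statement above, that a box stencil of radius r can be computed with 2r + 1 GEMM operations by sliding the input tile one row down between products, can be illustrated with a minimal pure-Python sketch. This is not the citing paper's kernel: the function names (`matmul`, `band_matrix`, `box_stencil_gemm`, `box_stencil_direct`), the tile orientation, and the banded layout of the parameter matrix are all assumptions made for illustration, and only interior points are covered (no boundary handling).

```python
# Illustrative sketch only: every name and the matrix layout are assumptions,
# not taken from the cited work.

def matmul(a, b):
    """Plain dense matrix product, standing in for one GEMM on a tensor unit."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def band_matrix(weights_row, width):
    """Banded (width + 2r) x width parameter matrix for one stencil row.

    Column j carries the 2r + 1 weights in rows j .. j + 2r, so that
    (tile @ P)[i][j] == sum over d of weights_row[d] * tile[i][j + d].
    """
    taps = len(weights_row)  # 2r + 1
    p = [[0] * width for _ in range(width + taps - 1)]
    for j in range(width):
        for d in range(taps):
            p[j + d][j] = weights_row[d]
    return p

def box_stencil_gemm(mesh, weights):
    """Interior box-stencil update as a sum of 2r + 1 matrix products."""
    taps = len(weights)  # 2r + 1
    out_rows = len(mesh) - taps + 1
    out_cols = len(mesh[0]) - taps + 1
    out = [[0] * out_cols for _ in range(out_rows)]
    for d in range(taps):
        # Move the input tile one row down for each successive product.
        tile = mesh[d:d + out_rows]
        partial = matmul(tile, band_matrix(weights[d], out_cols))
        for i in range(out_rows):
            for j in range(out_cols):
                out[i][j] += partial[i][j]
    return out

def box_stencil_direct(mesh, weights):
    """Reference: direct weighted sum over the (2r + 1) x (2r + 1) neighborhood."""
    taps = len(weights)
    out_rows = len(mesh) - taps + 1
    out_cols = len(mesh[0]) - taps + 1
    return [[sum(weights[di][dj] * mesh[i + di][j + dj]
                 for di in range(taps) for dj in range(taps))
             for j in range(out_cols)]
            for i in range(out_rows)]
```

Comparing `box_stencil_gemm(mesh, weights)` with `box_stencil_direct(mesh, weights)` on a small integer mesh is a quick way to check that the two formulations agree on interior points.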