Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures 2021
DOI: 10.1145/3409964.3461803

Fast Stencil Computations using Fast Fourier Transforms

Abstract: Stencil computations are widely used to simulate the change of state of physical systems across a multidimensional grid over multiple timesteps. The state-of-the-art techniques in this area fall into three groups: cache-aware tiled looping algorithms, cache-oblivious divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. In this paper, we present two efficient parallel algorithms for performing linear stencil computations. Current direct solvers in this domain are computationally inefficient, a…

Cited by 10 publications (14 citation statements)
References 145 publications
“…State-of-the-arts Although recent studies on Stencil Dwarf only exhibit their own absolute performance without comparison in experiments [37,66,1], we reproduce artifacts exhaustively and compare the performance of different state-of-the-arts for a comprehensive analysis. Two classic vectorization methods (Auto Vectorization [35] and Data Reorganization [64]) are employed first as a standard baseline on CPUs.…”
Section: Methods (mentioning)
confidence: 99%
“…C(VR) Pluto [7] Diamond [9] AutoVec. C(VR) Folding [34] Polyhedral Folding C(VR) Tetris (CPU) Tessellate Skewed 2 C(VR) Brick [66] Brick Scatter G(CCU) AN5D [37] Temporal -G(CCU) Tetris (GPU) Checkerboard Trapezoid G(TCU) Tetris Polymorphic Template C(VR) + G(TCU) 1 For better clarity, CPU, GPU, Vector Register, CUDA Core Units and Tensor Core Units are abbreviated with C, G, VR, CCU and TCU respectively. 2 Notice we use a keyword as an abbreviation for some algorithms.…”
Section: State-of-the-art Comparison (mentioning)
confidence: 99%
“…The value computed for the root node is the required option value. Straightforward iterative implementation of this method runs in time Θ(𝑇²). This method can be used for both European and American options.…”
Section: Symbol (mentioning)
confidence: 99%
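The quoted statement describes a tree method whose root value is the option price and whose straightforward iterative form costs Θ(𝑇²). A minimal sketch of such a backward induction follows. The citing paper's exact parameterization is not shown in the excerpt, so the Cox-Ross-Rubinstein up/down factors used here are an assumption, and the function name `binomial_option_price` and its parameters are hypothetical.

```python
import math

def binomial_option_price(s0, strike, rate, sigma, maturity, steps,
                          american=False, put=False):
    """Price an option by backward induction on a recombining binomial tree.

    Assumption: Cox-Ross-Rubinstein factors u = exp(sigma*sqrt(dt)), d = 1/u.
    The nested loops visit about steps^2 / 2 nodes, hence Theta(T^2) time.
    """
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-rate * dt)               # one-step discount factor
    p = (math.exp(rate * dt) - d) / (u - d)   # risk-neutral up probability

    def payoff(s):
        return max(strike - s, 0.0) if put else max(s - strike, 0.0)

    # Leaf values at the final step; node j has j up-moves.
    values = [payoff(s0 * u**j * d**(steps - j)) for j in range(steps + 1)]

    # Walk back towards the root, overwriting the level in place.
    for t in range(steps - 1, -1, -1):
        for j in range(t + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # American options may be exercised early at any node.
                cont = max(cont, payoff(s0 * u**j * d**(t - j)))
            values[j] = cont
    return values[0]  # value at the root node


if __name__ == "__main__":
    print(binomial_option_price(100.0, 100.0, 0.05, 0.2, 1.0, 500))
    print(binomial_option_price(100.0, 100.0, 0.05, 0.2, 1.0, 500,
                                american=True, put=True))
```

The only difference between the European and American cases is the early-exercise comparison inside the inner loop, which is why the same tree serves both.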
“…A stencil is called linear if it computes the value of a cell at time step 𝑑 as a fixed linear combination of cell values at time steps before 𝑑, otherwise, it is called nonlinear. For 1D linear stencils Ahmad et al. [2] provide FFT-based algorithms that take O(𝑇 log 𝑇) time for periodic grids and O(𝑇 log² 𝑇) time for aperiodic grids, assuming the size of the input grid to be Θ(𝑇).…”
Section: Symbol (mentioning)
confidence: 99%
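For the periodic case mentioned in the statement above, the idea can be illustrated with the convolution theorem: one step of a linear stencil on a periodic grid is a circular convolution, so 𝑇 steps amount to raising the kernel's Fourier transform to the 𝑇-th power elementwise. The sketch below is an illustration under that assumption, not the cited authors' implementation; the names `stencil_naive` and `stencil_fft` and the offset-to-coefficient dictionary format are hypothetical.

```python
import numpy as np

def stencil_naive(u0, weights, steps):
    """Apply a 1D periodic linear stencil step by step.

    weights maps an integer offset to its coefficient, so one step computes
    new_u[i] = sum(c * u[(i + off) % n]). Cost: O(n * steps) per stencil point.
    """
    u = np.asarray(u0, dtype=float)
    for _ in range(steps):
        # np.roll(u, -off)[i] equals u[(i + off) % n].
        u = sum(c * np.roll(u, -off) for off, c in weights.items())
    return u

def stencil_fft(u0, weights, steps):
    """Compute the same result with one forward and one inverse FFT."""
    u = np.asarray(u0, dtype=float)
    n = u.size
    # Lay the stencil out as a circular convolution kernel.
    kernel = np.zeros(n)
    for off, c in weights.items():
        kernel[(-off) % n] += c
    symbol = np.fft.fft(kernel)  # per-frequency multiplier for one step
    # Repeating the step multiplies each frequency by symbol again,
    # so all steps collapse into a single elementwise power.
    return np.fft.ifft(np.fft.fft(u) * symbol**steps).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u0 = rng.random(64)
    heat = {-1: 0.25, 0: 0.5, 1: 0.25}  # simple 3-point averaging stencil
    print(np.allclose(stencil_naive(u0, heat, 50), stencil_fft(u0, heat, 50)))
```

The FFT version does O(𝑛 log 𝑛) work for the transforms regardless of the number of steps, which matches the O(𝑇 log 𝑇) bound quoted for periodic grids when the grid size is Θ(𝑇). Aperiodic boundaries need additional handling that this sketch does not attempt.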