Proceedings of the 2006 Workshop on Memory System Performance and Correctness, 2006
DOI: 10.1145/1178597.1178605

Implicit and explicit optimizations for stencil computations

Abstract: Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, due primarily to the disparity between processor and main memory speeds. We examine several optimizations on both the conventional cache-based memory systems of the Itanium 2, Opteron, and Power5, as well as the heterogeneous multicore design of the Cell processor. The optimizations target cache reuse across stencil sweeps, including both an…

Cited by 105 publications (80 citation statements). References 11 publications.

Citation statements
“…A number of works have addressed optimizations of stencil computations on emerging multicore platforms [7], [16], [17], [6], [27], [26], [11], [37], [10], [4], [9], [40], [38], [41], [8], [39]. In addition, other transformations such as tiling of stencil computations for multicore architectures have been addressed in [43], [25], [21], [34].…”
Section: Related Work (mentioning)
confidence: 99%
“…Periodic domains have also been tiled using rhombus shaped tiles [5,8,10]. These tiles overlap at their bases, causing two tiles to compute some duplicate results.…”
Section: Related Work (mentioning)
confidence: 99%
“…These benefits can be enough to overcome the extra time spent recomputing a portion of the iterations. Overlapping tiles can also be used to handle wraparound dependencies that are introduced due to periodic boundaries [5,8]. The presented techniques could feasibly be extended to handle the ring, cylinder, and torus domains.…”
Section: Related Work (mentioning)
confidence: 99%
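
The last two statements describe overlapping tiles that recompute some results and that can absorb the wraparound dependencies of a periodic domain. The sketch below illustrates that general idea only; it is not the method of the cited papers. It assumes a 1D three-point averaging stencil, and each tile's local buffer shrinks by one cell per side at every time step rather than forming the rhombus shape mentioned above. The function names (`step_periodic`, `sweep_reference`, `sweep_overlapped`) and the `tile` parameter are hypothetical.

```python
def step_periodic(u):
    """One sweep of a 3-point averaging stencil with periodic boundaries."""
    n = len(u)
    return [(u[(i - 1) % n] + u[i] + u[(i + 1) % n]) / 3.0 for i in range(n)]


def sweep_reference(u, steps):
    """Untiled reference: sweep the whole grid once per time step."""
    for _ in range(steps):
        u = step_periodic(u)
    return u


def sweep_overlapped(u, steps, tile):
    """Overlapped tiling (illustrative): each tile advances `steps` time
    steps on its own, using a private halo of `steps` cells per side."""
    n = len(u)
    out = [0.0] * n
    for start in range(0, n, tile):
        width = min(tile, n - start)
        # Load the tile plus its halo; the modulo handles the wraparound
        # dependency at the periodic boundary.
        buf = [u[(start - steps + k) % n] for k in range(width + 2 * steps)]
        for _ in range(steps):
            # Each local step consumes one cell on each side, so halo cells
            # shared with neighbouring tiles are computed more than once.
            buf = [(buf[k - 1] + buf[k] + buf[k + 1]) / 3.0
                   for k in range(1, len(buf) - 1)]
        # After `steps` local steps exactly `width` cells remain.
        out[start:start + width] = buf
    return out


if __name__ == "__main__":
    grid = [float((i * i) % 7) for i in range(10)]
    tiled = sweep_overlapped(grid, steps=3, tile=4)
    plain = sweep_reference(grid, steps=3)
    print(max(abs(a - b) for a, b in zip(tiled, plain)))
```

Because every tile reads only the original grid, tiles are independent and could run in any order; the price is that each tile performs redundant work in its halo, which grows with the number of time steps fused per tile.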