A cost-effective implementation of multilevel tiling

Jimenez, M Manuel; Llaberia, J.M.; Fernandez, A.

doi:10.1109/tpds.2003.1239869

Cited by 10 publications

(7 citation statements)

References 18 publications

“…Also, note that actual tile sizes are a product of all inner tile sizes because tiling at level k is a tiling on the (k + 1) tiled space, not the original iteration space. Although this formulation is a direct extension of Xue's definition of single level tiling [33], to the best of our knowledge, this is the first formalization and presentation of it; other formulations [15] of multi-level tiling are based on the strip-mine and interchange view of tiling. Now given the fact that this set P m tiled is a polyhedron, the scanning loops can be easily generated by existing tools, such as the Omega test and CLooG.…”

Section: Multi-level Tiling For Fixed Tile Sizes (mentioning)

confidence: 99%
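
The point in the statement above, that each level tiles the tile space of the next inner level so that actual tile extents multiply, can be illustrated on a one-dimensional loop. This is a minimal sketch for a rectangular 1-D range only, not the formulation of either paper; the function names `effective_tile_sizes` and `two_level_tiled` and the convention that sizes are listed outermost first are assumptions made for the illustration.

```python
from math import prod


def effective_tile_sizes(level_sizes):
    """Extent of each level's tiles, measured in original iterations.

    level_sizes[0] is the outermost level (an assumed convention). Each
    level's size counts tiles of the next inner level, so the extent in
    original iterations is the product of that size and all inner sizes.
    """
    return [prod(level_sizes[k:]) for k in range(len(level_sizes))]


def two_level_tiled(n, s_outer, s_inner):
    """Visit 0..n-1 under two levels of tiling.

    Yields (outer_tile, inner_tile, i). The outer loop runs over the
    space of inner tiles, not over the original iterations.
    """
    inner_tiles = -(-n // s_inner)            # ceil(n / s_inner)
    outer_tiles = -(-inner_tiles // s_outer)  # ceil(inner_tiles / s_outer)
    for to in range(outer_tiles):
        # Inner tiles owned by this outer tile, clipped at the boundary.
        for ti in range(to * s_outer, min((to + 1) * s_outer, inner_tiles)):
            # Original iterations owned by this inner tile, clipped at n.
            for i in range(ti * s_inner, min((ti + 1) * s_inner, n)):
                yield to, ti, i
```

With `s_outer = 3` and `s_inner = 4`, each outer tile spans 12 original iterations, which is what `effective_tile_sizes([3, 4])` reports for the outer level.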

“…The experimental results in Section 6.1 show how this exponential growth with respect to number of levels renders the technique inapplicable beyond two levels of tiling. The multi-level tiled loop generation method proposed by Jiménez et al. [15] has an exponential time complexity at each level of tiling, and this grows linearly with the number of levels of tiling.…”

Section: Complexity and Scalability Of The Algorithm (mentioning)

confidence: 99%

“…However, several important applications such as numerical linear algebra routines (e.g., LU decomposition, triangular matrix product, symmetric rank updates from BLAS) and stencil computations (after applying skewing to enable tiling) have non-rectangular iteration spaces. Multi-level tiling has been shown to be very useful for them [14,15,27].…”

Section: Introduction (mentioning)

confidence: 99%

“…We are aware of only one solution that can generate arbitrary levels of multi-level tiled code for general polyhedral iteration spaces [15]. Their technique is limited to the case when tile sizes are fixed at compile (tiled loop generation) time, which is a severe limitation in the situations discussed above.…”

Section: Introduction (mentioning)

confidence: 99%

Multi-level tiling

Kim, Renganarayanan, Rostron, et al. 2007

Proceedings of the 2007 ACM/IEEE Conference on Supercomputing

Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. High-performance implementations use multiple levels of tiling to exploit the hierarchy of parallelism and cache/register locality. Efficient generation of multi-level tiled code is essential for effective use of multi-level tiling. Parameterized tiled code, where tile sizes are not fixed but left as symbolic parameters, can enable several dynamic and run-time optimizations. Previous solutions to multi-level tiled loop generation are limited to the case where tile sizes are fixed at compile time. We present an algorithm that can generate multi-level parameterized tiled loops at the same cost as generating single-level tiled loops. The efficiency of our method is demonstrated on several benchmarks. We also present a method, useful in register tiling, for separating partial and full tiles at any arbitrary level of tiling. The code generator we have implemented is available as an open source tool.
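
To make the notions of parameterized tile sizes and full versus partial tiles in the abstract above concrete, here is a minimal single-level sketch over a triangular iteration space {(i, j) : 0 <= j <= i < n}. It is not the algorithm of the paper: it simply walks a grid of tile origins and clips each tile against the triangle. The name `tiled_triangle` and the choice of iteration space are assumptions made for the illustration; the tile sizes `ti` and `tj` are ordinary run-time arguments rather than compile-time constants.

```python
def tiled_triangle(n, ti, tj):
    """Enumerate {(i, j) : 0 <= j <= i < n} tile by tile.

    Yields ((i0, j0, full), (i, j)), where (i0, j0) is the tile origin and
    `full` says whether the whole ti x tj rectangle lies inside the space.
    """
    for i0 in range(0, n, ti):
        i_hi = min(i0 + ti, n)
        # Tiles whose first column lies right of the last row hold no points.
        for j0 in range(0, i_hi, tj):
            j_hi = min(j0 + tj, n)
            # Full tile: not clipped by i < n, and its largest j does not
            # exceed its smallest i, so j <= i holds at every point.
            full = (i0 + ti <= n) and (j0 + tj - 1 <= i0)
            for i in range(i0, i_hi):
                # Partial tiles are clipped by the diagonal j <= i.
                for j in range(j0, min(j_hi, i + 1)):
                    yield (i0, j0, full), (i, j)
```

A code generator would typically emit separate, guard-free loop nests for the full tiles; the flag here only marks where that separation could be made.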

“…Multi-level tiling has become a key technique for high-performance computation. There has been work on generating efficient multi-level tiled code for polyhedral iteration spaces that handle tile sizes at compile time [23] and that handle tile sizes as symbolic parameters [26].…”

Section: Related Work (mentioning)

confidence: 99%

Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories

Baskaran, Bondhugula, Krishnamoorthy, et al. 2008

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Several parallel architectures such as GPUs and the Cell processor have fast explicitly managed on-chip memories, in addition to slow off-chip memory. They also have very high computational power with multiple levels of parallelism. A significant challenge in programming these architectures is to effectively exploit the parallelism available in the architecture and manage the fast memories to maximize performance. In this paper we develop an approach to effective automatic data management for on-chip memories, including creation of buffers in on-chip (local) memories for holding portions of data accessed in a computational block, automatic determination of array access functions of local buffer references, and generation of code that moves data between slow off-chip memory and fast local memories. We also address the problem of mapping computation in regular programs to multi-level parallel architectures using a multi-level tiling approach, and study the impact of on-chip memory availability on the selection of tile sizes at various levels. Experimental results on a GPU demonstrate the effectiveness of the proposed approach.