2007
DOI: 10.1145/1273442.1250780

Parameterized tiled loops for free

Abstract: Parameterized tiled loops, where the tile sizes are not fixed at compile time but remain symbolic parameters until later, are quite useful for iterative compilers and "auto-tuners" that produce highly optimized libraries and codes. Tile size parameterization could also enable optimizations such as register tiling to become dynamic optimizations. Although it is easy to generate such loops for (hyper) rectangular iteration spaces tiled with (hyper) rectangular tiles, many important computations do not fall into t…
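As an illustration of the easy case the abstract mentions, the following is a minimal sketch of a parameterized tiled loop nest over a rectangular two-dimensional iteration space. It is not code from the paper: the function name `visit_tiled` and the parameters `n`, `m`, `ti`, and `tj` are hypothetical, chosen only for this example. The point it shows is that the tile sizes are ordinary runtime arguments, so the same loop nest serves any tile size without being regenerated.

```python
def visit_tiled(n, m, ti, tj):
    """Visit every point (i, j) of an n x m rectangular iteration space,
    one ti x tj tile at a time. ti and tj are runtime parameters
    (hypothetical names, not taken from the paper)."""
    if ti < 1 or tj < 1:
        raise ValueError("tile sizes must be positive")
    visited = []
    for ii in range(0, n, ti):          # tile origins along i
        for jj in range(0, m, tj):      # tile origins along j
            # Point loops: min() clamps partial tiles at the boundary.
            for i in range(ii, min(ii + ti, n)):
                for j in range(jj, min(jj + tj, m)):
                    visited.append((i, j))
    return visited


if __name__ == "__main__":
    # The same function handles different tile sizes chosen at run time.
    print(visit_tiled(2, 4, 2, 2))
    print(len(visit_tiled(5, 7, 2, 3)))
```

For a rectangular space the tile-origin bounds and the clamped point-loop bounds are all that is needed. Non-rectangular iteration spaces need more elaborate bounds, which this sketch does not attempt.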

Citations: cited by 26 publications (36 citation statements)
References: 31 publications
“…To the best of my knowledge there is no iterative compilation method including the optimizations presented in this paper with all their parameters; iterative compilation techniques either do not use the transformations presented in this paper at all, or they use some of them to some extent [23] [24] [25], e.g., loop tiling is applied only for specific tile sizes and levels of tiling and loop unroll is applied only for specific unroll factor values. Normally, iterative compilation methods include transformations with low compilation time such as common subexpression elimination, unreachable code elimination, branch chaining and not compile time expensive transformations such as loop tiling; I show that if the transformations presented in Fig. 1 (including almost all different transformation parameters) are included in iterative compilation, the search space is from 10^17 up to 10^29 schedules (for the given input sizes) (Table 1); given that 1sec = 3.17 × 10…”
Section: Results (mentioning)
confidence: 99%
“…Iterative compilation techniques either do not use loop tiling and loop unroll transformations at all, or they use them only for specific tile sizes, levels of tiling and unroll factor values [23] [24] [25]. In [23], one level of tiling is used with tile sizes from 1 up to 100 and unroll factor values from 1 up to 20 (innermost iterator only).…”
Section: Related Work (mentioning)
confidence: 99%
“…Although production compilers today may have limited tiling capability, there have been significant recent advances in automatic source-to-source transformations for tiling and several systems for parametric tiling have been developed and made publicly available such as TLOG [24], HITLOG [18] and PrimeTile [15]. With such tiled-code generators, it is now possible to generate tiled code for compute-intensive inner kernels (including imperfectly nested loops), that can be tuned to the cache characteristics of the target platform.…”
Section: Parametric Tiling (mentioning)
confidence: 99%
“…Loop Tiling [7,17,23,29,35,36] is a classical technique to enhance data reuse in memory hierarchy levels close to the processor. Recent advances have made it possible to automatically generate parametrically tiled code, even for imperfectly nested loops [2,15,18,24]. It is well known that the choice of tile sizes has a significant effect on performance, but the effective selection of optimized tile sizes remains an open problem that has become ever more challenging as processor memory hierarchies increase in complexity and depth.…”
Section: Introduction (mentioning)
confidence: 99%
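
The citation statements above describe tile-size selection as an empirical tuning problem. The following is a minimal sketch of how an auto-tuner might use a parametrically tiled kernel, assuming a plain list-of-lists matrix multiply and wall-clock timing. The names `matmul_tiled` and `pick_tile_size` are hypothetical and do not come from any of the cited systems. Timings from interpreted Python do not reflect cache behavior the way compiled code would, so this shows only the shape of the search, not realistic results.

```python
import time


def matmul_tiled(a, b, n, t):
    """Multiply two n x n matrices (lists of lists) using t x t x t tiles.
    The tile size t is a runtime parameter."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, t):
        for kk in range(0, n, t):
            for jj in range(0, n, t):
                # Point loops, clamped so partial tiles stay in bounds.
                for i in range(ii, min(ii + t, n)):
                    for k in range(kk, min(kk + t, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + t, n)):
                            c[i][j] += aik * b[k][j]
    return c


def pick_tile_size(n, candidates):
    """Time the same kernel once per candidate tile size and return
    (fastest tile size, dict of timings in seconds)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for t in candidates:
        start = time.perf_counter()
        matmul_tiled(a, b, n, t)
        timings[t] = time.perf_counter() - start
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best, timings = pick_tile_size(32, [1, 4, 8, 16, 32])
    print("fastest tile size in this run:", best)
```

Because the tile size is an argument rather than a compile-time constant, the search loop reuses one kernel for every candidate instead of regenerating code per tile size.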