Parameterized tiled loops-where the tile sizes are not fixed at compile time, but remain symbolic parameters until later-are quite useful for iterative compilers and "auto-tuners" that produce highly optimized libraries and codes. Tile size parameterization could also enable optimizations such as register tiling to become dynamic optimizations. Although it is easy to generate such loops for (hyper) rectangular iteration spaces tiled with (hyper) rectangular tiles, many important computations do not fall into this restricted domain. Parameterized tile code generation for the general case of convex iteration spaces being tiled by (hyper) rectangular tiles has in the past been solved with bounding box approaches or symbolic Fourier Motzkin approaches. However, both approaches have less than ideal code generation efficiency and resulting code quality. We present the theoretical foundations, implementation, and experimental validation of a simple, unified technique for generating parameterized tiled code. Our code generation efficiency is comparable to all existing code generation techniques including those for fixed tile sizes, and the resulting code is as efficient as, if not more than, all previous techniques. Thus the technique provides parameterized tiled loops for free! Our "one-size-fits-all" solution, which is available as open source software can be adapted for use in production compilers.
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. High-performance implementations use multiple levels of tiling to exploit the hierarchy of parallelism and cache/register locality. Efficient generation of multi-level tiled code is essential for effective use of multi-level tiling. Parameterized tiled code, where tile sizes are not fixed but left as symbolic parameters can enable several dynamic and run-time optimizations. Previous solutions to multi-level tiled loop generation are limited to the case where tile sizes are fixed at compile time. We present an algorithm that can generate multi-level parameterized tiled loops at the same cost as generating single-level tiled loops. The efficiency of our method is demonstrated on several benchmarks. We also present a method-useful in register tiling-for separating partial and full tiles at any arbitrary level of tiling. The code generator we have implemented is available as an open source tool.
Loop tiling is a widely used program optimization that improves data locality and enables coarse-grained parallelism. Parameterized tiled loops, where the tile sizes remain symbolic parameters until runtime, are quite useful for iterative compilers and autotuners that produce highly optimized libraries and codes. Although it is easy to generate such loops for (hyper-) rectangular iteration spaces tiled with (hyper-) rectangular tiles, many important computations do not fall into this restricted domain. In the past, parameterized tiled code generation for the general case of convex iteration spaces being tiled by (hyper-) rectangular tiles has been solved with bounding box approaches or with sophisticated and expensive machinery. We present a novel formulation of the parameterized tiled loop generation problem using a polyhedral set called the outset . By reducing the problem of parameterized tiled code generation to that of generating standard loops and simple postprocessing of these loops, the outset method achieves a code generation efficiency that is comparable to existing code generation techniques, including those for fixed tile sizes. We compare the performance of our technique with several other tiled loop generation methods on kernels from BLAS3 and scientific computations. The simplicity of our solution makes it well suited for use in production compilers—in particular, the IBM XL compiler uses the inset-based technique introduced in this article for register tiling. We also provide a complete coverage of parameterized tiling of perfect loop nests by describing three related techniques: (i) a scheme for separating full and partial tiles; (ii) a scheme for generating tiled loops directly from the abstract syntax tree representation of loops; (iii) a formal characterization of parameterized loop tiling using bilinear forms and a Symbolic Fourier-Motzkin Elimination (SFME)-based parameterized tiled loop generation method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.