1999
DOI: 10.1007/978-3-540-49051-7_12

A Comparison of Compiler Tiling Algorithms

Abstract: Linear algebra codes contain data locality which can be exploited by tiling multiple loop nests. Several approaches to tiling have been suggested for avoiding conflict misses in low associativity caches. We propose a new technique based on intra-variable padding and compare its performance with existing techniques. Results show padding improves performance of matrix multiply by over 100% in some cases over a range of matrix sizes. Comparing the efficacy of different tiling algorithms, we discover rec…
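
To make the term "tiling" in the abstract concrete, here is a minimal sketch of a tiled matrix multiply. It is an illustration only, not the algorithm evaluated in the paper: the function name `matmul_tiled`, the loop order, and the single square tile size `tile` are assumptions chosen for brevity.

```python
def matmul_tiled(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) tile by tile.

    Hypothetical illustration: `tile` is one square tile size applied to the
    k and j loops, so a tile x tile block of `b` is reused for every row i
    before the computation moves on to the next block.
    """
    c = [[0] * n for _ in range(n)]
    for kk in range(0, n, tile):              # tile the k loop
        for jj in range(0, n, tile):          # tile the j loop
            for i in range(n):
                row_a, row_c = a[i], c[i]
                for k in range(kk, min(kk + tile, n)):
                    a_ik, row_b = row_a[k], b[k]
                    for j in range(jj, min(jj + tile, n)):
                        row_c[j] += a_ik * row_b[j]
    return c
```

The result is identical to the untiled triple loop; only the order in which elements are visited changes. The tile size should be chosen so that the reused block fits in cache.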

Cited by 45 publications (73 citation statements)
References 30 publications
“…After extensive experimentation we found that Chame and Moon's algorithm tends to perform consistently better than other heuristics [3,4,12] when the tile size is reduced to cope with cache sharing. We note however that it is beyond the scope of this paper to investigate thoroughly the relative performance of tiling algorithms on SMT processors.…”
Section: Dynamic Tiling Copying and Block Data Layout (mentioning)
confidence: 95%
“…It is used extensively in scientific libraries such as LAPACK. It is also the target of numerous compiler optimizations for memory hierarchies [3,4,6,9,12,17].…”
Section: Introduction (mentioning)
confidence: 99%
“…Conflict misses [20] may occur when too many data items map to the same set of cache locations, causing cache lines to be flushed from cache before they may be used, despite sufficient capacity in the overall cache. As a result, in addition to eliminating capacity misses [11], [23] and maximizing cache utilization, the tile should be selected in such a way that there are no (or few) self conflict misses, while cross conflict misses are minimized [3], [4], [5], [10], [17].…”
Section: Related Work (mentioning)
confidence: 99%
“…Unfortunately, the performance of a tiled program resulting from existing tiling heuristics does not have robust performance [13], [17]. Instability comes from the so-called pathological array sizes, when array dimensions are near powers of two, since cache interference is a particular risk at that point.…”
Section: Related Work (mentioning)
confidence: 99%
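
The conflict-miss and "pathological array size" statements quoted above can be illustrated with a small address-mapping sketch. It is not taken from any of the cited papers, and every parameter is an assumption for illustration: a direct-mapped 16 KB cache with 32-byte lines, 8-byte elements, column-major storage, and a tile anchored at element (0, 0). The helper name `tile_cache_sets` is hypothetical.

```python
def tile_cache_sets(leading_dim, tile_rows, tile_cols,
                    elem_bytes=8, line_bytes=32, cache_bytes=16 * 1024):
    """Return (distinct_lines, distinct_sets) touched by one tile.

    Assumes a direct-mapped cache and a column-major array whose columns
    are `leading_dim` elements apart. If distinct_sets < distinct_lines,
    some lines of the tile evict each other (self-conflict misses).
    """
    num_sets = cache_bytes // line_bytes      # one line per set (direct-mapped)
    lines, sets = set(), set()
    for j in range(tile_cols):
        for i in range(tile_rows):
            addr = (j * leading_dim + i) * elem_bytes
            line = addr // line_bytes
            lines.add(line)
            sets.add(line % num_sets)
    return len(lines), len(sets)


# Power-of-two leading dimension: every column starts in the same cache set.
print(tile_cache_sets(2048, 16, 16))   # (64, 4)
# Padding each column by 16 elements spreads the tile over distinct sets.
print(tile_cache_sets(2064, 16, 16))   # (64, 64)
```

With a leading dimension of 2048, each column occupies exactly 16 KB, so the 64 lines of the 16 x 16 tile compete for only 4 sets even though the cache has room for all of them. Padding the leading dimension to 2064 removes the overlap under these assumed parameters.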