2009
DOI: 10.1109/mc.2009.57

Refactoring for Data Locality

Cited by 60 publications (38 citation statements)
References 7 publications
“…Traditionally, loop transformation is performed by skilled programmers or tuners [5]. Even in a typical profile-driven performance tuning process by programmers, detecting hot spots and understanding their loop structures without reading code carefully are known to be helpful for them [6]. Our precise results that show precise loop nest structures and their weights in the execution help make better strategies for loop transformation.…”
(Copyright © 2014 The Institute of Electronics, Information and Communication Engineers)
Section: Introduction (mentioning)
confidence: 88%
“…There are three significant advantages of our loop nest detection system over the previous methods [6]- [8]. The first is that it enables precise detection of inter-procedural dynamic loop nests.…”
Section: Introduction (mentioning)
confidence: 99%
“…The SLO [7] reuse profiling tool is used to compare the original schedule of Listing 1, with the optimized schedules of Table I. The results are depicted in Figure 11, for the original iteration order the external communications are reduced when a new loop level fits in the buffer.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…We estimated the energy of data transfer for varying onchip memory size with a memory tracing tool [7], and did energy estimation for external [8] and on-chip [9] accesses. From the result, depicted in Figure 1, we conclude that increasing accelerator utilization with more external memory bandwidth is bad for energy.…”
Section: Introduction (mentioning)
confidence: 99%
“…For simplicity the result matrix C is initialized to 0, and the loop bounds are set to: Bi=500, Bj=400, Bk=300. We used Suggestions for Locality Optimizations (SLO) [17] to simulate the remaining data transfers. Remaining transfers are defined as the total number of memory accesses minus the reuses of data elements, as depicted for different buffer sizes in fig.…”
Section: Motivation: Scheduling For Data Locality (mentioning)
confidence: 99%
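
The definition quoted in the last statement (remaining transfers = total memory accesses minus reuses) can be illustrated with a small sketch. This is not SLO and not the citing paper's method. It assumes a fully associative buffer with LRU replacement, measured in data elements, and one access each to A, B and C per iteration of a standard `i, j, k` matrix-multiply loop nest. The function name `remaining_transfers` and its parameters are hypothetical.

```python
from collections import OrderedDict


def remaining_transfers(bi, bj, bk, buffer_size):
    """Total accesses minus reuses for C[i][j] += A[i][k] * B[k][j].

    Assumption: a fully associative LRU buffer holding `buffer_size`
    data elements; an access that finds its element in the buffer
    counts as a reuse, every other access is a remaining transfer.
    """
    buffer = OrderedDict()  # element -> True, ordered from least to most recent
    total = 0
    reuses = 0
    for i in range(bi):
        for j in range(bj):
            for k in range(bk):
                # Assumed access pattern: one access per array per iteration.
                for element in (("A", i, k), ("B", k, j), ("C", i, j)):
                    total += 1
                    if element in buffer:
                        reuses += 1
                        buffer.move_to_end(element)  # mark as most recent
                    else:
                        buffer[element] = True
                        if len(buffer) > buffer_size:
                            buffer.popitem(last=False)  # evict least recent
    return total - reuses


if __name__ == "__main__":
    # Small bounds keep the run short in pure Python.
    for size in (0, 4, 16, 64):
        print(size, remaining_transfers(8, 8, 8, size))
```

The loop bounds quoted above (Bi=500, Bj=400, Bk=300) imply 180 million accesses under this model, so a pure-Python run at that size would be slow. Smaller bounds show the same trend: the count falls as the buffer grows, down to one transfer per distinct element.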