Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing
DOI: 10.1109/icpp.1996.538554

Polynomial-time nested loop fusion with full parallelism

Abstract: Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way to reduce synchronization and improve data locality. Traditional fusion techniques, however, either cannot address the case when fusion-preventing dependences exist in nested loops, or cannot achieve good parallelism after fusion. This paper gives a significant improvement by presenting several efficient polynomial-time algorithms to solve thes…
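To make the term "fusion-preventing dependence" concrete, the following minimal Python sketch may help. It is an illustration only, not the algorithms of the paper: it uses one-dimensional loops rather than nested loops, and the function names `separate_loops` and `fused_with_shift` and the arrays `a`, `b`, `c` are hypothetical. The second loop reads `b[i + 1]`, which the first loop produces in a later iteration, so fusing the two bodies directly would read a value before it is written. Shifting the second body by one iteration makes the fusion legal.

```python
def separate_loops(a):
    """Two separate loops; a barrier would sit between them on a multiprocessor."""
    n = len(a)
    # Loop 1: produce b.
    b = [a[i] * 2 for i in range(n)]
    # Loop 2: reads b[i + 1], which loop 1 writes one iteration *later*.
    # This is the dependence that prevents naive fusion.
    c = [b[i] + b[i + 1] for i in range(n - 1)]
    return b, c


def fused_with_shift(a):
    """Single fused loop; the second body is shifted by one iteration."""
    n = len(a)
    b = [0] * n
    c = [0] * (n - 1)
    for i in range(n):
        b[i] = a[i] * 2
        # Shifted body of loop 2: iteration i computes c[i - 1],
        # so both b[i - 1] and b[i] are already available.
        if i >= 1:
            c[i - 1] = b[i - 1] + b[i]
    return b, c


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(separate_loops(data))
    print(fused_with_shift(data))
```

Note that in this sketch the fused loop still carries a dependence between consecutive iterations (iteration `i` reads `b[i - 1]`), so its iterations cannot simply run in parallel. That is the kind of lost parallelism after fusion that the abstract refers to.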

Citations: cited by 8 publications (13 citation statements)
References: 9 publications
“…12a. EX2, EX3, and EX4 refer to the DSP applications presented in [17] that have several loops, including WDF (Wave Digital filter), IIR (Infinite Impulse Response filter), DPCM (Differential Pulse-Code Modulation device), and 2D (Two Dimensional filter). EX5 and EX6 refer to the LDGs presented in [21].…”
Section: Simulation Results (mentioning)
confidence: 99%
“…People have found that various loop transformation techniques including loop fusion, loop distribution and loop interchanging could improve data locality and reduce the number of memory accesses so as to improve timing performance and reduce the energy consumption of data dominated applications [2,3,13-21].…”
Section: Introduction (mentioning)
confidence: 99%
“…LDG1, LDG2, and LDG3 refer to the examples presented in Figure 2, 8, and 17 in [6]. LDG4 and LDG5 refer to the examples shown in Figure 2(a) and Figure 6(a) in [4].…”
Section: Methods (mentioning)
confidence: 99%
“…Several valuable works have been proposed for loop fusion recently [19,9,17]. Loop shifting tries to compute loop index offsets for loop fusion.…”
Section: Introduction (mentioning)
confidence: 99%