1997
DOI: 10.1093/comjnl/40.6.322

Unique Sets Oriented Parallelization of Loops with Non-uniform Dependences

Abstract: Although many methods exist for nested loop partitioning, most of them perform poorly when parallelizing loops with non-uniform dependences. This paper addresses the issue of automatic parallelization of loops with non-uniform dependences. Such loops are normally not parallelized by existing parallelizing compilers and transformations. Even when parallelized in rare instances, the performance is very poor. Our approach is based on the "convex hull" theory which has adequate information to handle non-uniform dep…

Citations: cited by 12 publications (11 citation statements)
References: 16 publications
“…A number of alternatives have been proposed for the case of affine index expressions, e.g. uniformization oriented techniques [23,26,6,19,27] and dataflow oriented techniques [20,11]. In this paper, the loops with non-uniform dependences are parallelized using WHILE loops with irregular strides.…”
Section: Program Model (mentioning)
confidence: 99%
“…Since det(T) = 3, the largest partition has at most $1 + \log_3(N_1^2 + N_2^2)$ iterations by theorem 1.

Example 2. Consider another non-uniform dependence example used by Ju et al. [11]. The PDM partitioning can only find a parallelism of two in the innermost loop, thus the recurrence chain partitioning is applied using algorithm 1:

    a(2*i+3,j+1)=a(i+2*j+1,i+j+3)
    ENDDOALL
    ENDIF
    DOALL j=(i+2)/2,min(i+3,-i+10)
    a(2*i+3,j+1)=a(i+2*j+1,i+j+3)
    ENDDOALL
    DOALL j=max(-i+11,1),min(i+3,12)
    a(2*i+3,j+1)=a(i+2*j+1,i+j+3)
    ENDDOALL
    DOALL j=(3*i+8)/2,12
    a(2*i+3,j+1)=a(i+2* (2,6).…”
Section: Example (mentioning)
confidence: 99%
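
To make the term "non-uniform dependence" concrete, the following minimal sketch enumerates the dependences of the statement a(2*i+3,j+1)=a(i+2*j+1,i+j+3) quoted above and collects their distance vectors. The loop bounds (1 to 12 for both i and j) are an assumption for illustration; the excerpt does not state them. The function name `dependence_distances` is hypothetical and is not taken from the cited papers. The sketch does brute-force enumeration only and does not implement the convex hull or recurrence chain methods.

```python
def dependence_distances(n1=12, n2=12):
    """Return the set of (di, dj) offsets between the iteration that
    writes an array element and an iteration that reads the same element.

    Statement analysed: a(2*i+3, j+1) = a(i+2*j+1, i+j+3)
    Assumed bounds (not given in the excerpt): 1 <= i <= n1, 1 <= j <= n2.
    """
    # Map each written element to the iterations that write it.
    writes = {}
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            writes.setdefault((2 * i + 3, j + 1), []).append((i, j))

    # For each read, look up the writers of the same element and
    # record the offset (reading iteration minus writing iteration).
    distances = set()
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            for wi, wj in writes.get((i + 2 * j + 1, i + j + 3), []):
                distances.add((i - wi, j - wj))
    return distances


if __name__ == "__main__":
    # More than one distinct vector means the dependences are non-uniform.
    print(sorted(dependence_distances()))
```

For example, iteration (1,5) writes a(5,6), which iteration (2,1) reads, and iteration (2,7) writes a(7,8), which iteration (4,1) reads. The two offsets differ, (1,-4) versus (2,-6), so no single constant distance vector describes the loop.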