Although many methods exist for nested loop partitioning, most of them perform poorly when parallelizing loops with non-uniform dependences. This paper addresses the issue of automatic parallelization of loops with non-uniform dependences. Such loops are normally not parallelized by existing parallelizing compilers and transformations. Even when parallelized in rare instances, the performance is very poor. Our approach is based on the 'convex hull' theory, which has adequate information to handle non-uniform dependences. We introduce the concepts of 'complete dependence convex hull' and 'unique head and tail sets', and abstract the dependence information into these sets. These sets form the basis of the iteration space partitions. The properties of the unique head and tail sets are derived. Depending on the relative placement of these unique sets, partitioning schemes are suggested for implementation of our technique. Implementation results of our scheme on the Cray J916 and comparison with other schemes show the superiority of our technique.
In this paper we address the problem of partitioning nested loops with non-uniform (irregular) dependence vectors. Parallelizing and partitioning nested loops requires efficient inter-iteration dependence analysis. Although many methods exist for nested loop partitioning, most of these perform poorly when parallelizing nested loops with irregular dependences. Unlike nested loops with uniform dependences, these loops have a complicated dependence pattern that forms a non-uniform dependence vector set. We apply the results of classical convex theory and principles of linear programming to iteration spaces and show the correspondence between minimum dependence distance computation and iteration space tiling. Cross-iteration dependences are analyzed by forming an Integer Dependence Convex Hull (IDCH). Every integer point in this IDCH corresponds to a dependence vector in the iteration space of the nested loops. A simple way to compute minimum dependence distances from the dependence distance vectors of the extreme points of the IDCH is presented. Using these minimum dependence distances, the iteration space can be tiled. Iterations within a tile can be executed in parallel, and the different tiles can then be executed with proper synchronization. We demonstrate that our technique gives much better speedup and extracts more parallelism than the existing techniques.
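The link between a minimum dependence distance and tiling can be illustrated with a small sketch. This is not the paper's method: the abstract computes the minimum distance from the extreme points of the IDCH, whereas the sketch below simply enumerates every dependence by brute force over a small iteration space. The loop body (a write to A[3i+4][2j] and a read of A[i][j]), the bound N = 20, and all function names are hypothetical choices made only for illustration.

```python
from itertools import product

def dependence_pairs(n, write_idx, read_idx):
    """Return (source, sink) iteration pairs that touch the same array element.

    Brute-force stand-in for the convex-hull analysis; only write/read
    conflicts are considered (the hypothetical write subscript is injective,
    so there are no write/write conflicts in this example).
    """
    space = list(product(range(1, n + 1), repeat=2))
    writers = {}
    for it in space:
        writers.setdefault(write_idx(*it), []).append(it)
    pairs = []
    for it in space:
        for w in writers.get(read_idx(*it), []):
            if w != it:
                # The lexicographically earlier iteration executes first.
                pairs.append(tuple(sorted((w, it))))
    return pairs

def min_outer_distance(pairs):
    """Smallest dependence distance along the outer loop index i."""
    return min(dst[0] - src[0] for src, dst in pairs)

def strip_tiles(n, width):
    """Split outer-loop indices 1..n into consecutive strips of the given width."""
    if width < 1:
        raise ValueError("strip tiling along i needs a minimum distance >= 1")
    return [list(range(lo, min(lo + width, n + 1))) for lo in range(1, n + 1, width)]

# Hypothetical loop nest: A[3*i + 4][2*j] = f(A[i][j]) for 1 <= i, j <= N.
N = 20
write = lambda i, j: (3 * i + 4, 2 * j)
read = lambda i, j: (i, j)

pairs = dependence_pairs(N, write, read)
d_min = min_outer_distance(pairs)   # distances along i are 6, 8, 10, ... (non-uniform)
tiles = strip_tiles(N, d_min)       # strips of width d_min along i

# No dependence has both endpoints in the same strip, so iterations inside a
# strip can run in parallel; strips themselves execute one after another.
tile_of = lambda i: (i - 1) // d_min
independent = all(tile_of(src[0]) != tile_of(dst[0]) for src, dst in pairs)
```

Because every dependence spans at least d_min outer iterations, a strip narrower than or equal to d_min cannot contain both the source and the sink of any dependence. The brute-force enumeration costs time proportional to the size of the iteration space, which is exactly the cost an analytical bound on the minimum distance is meant to avoid.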