Proceedings of the 8th International Conference on Supercomputing - ICS '94 1994
DOI: 10.1145/181181.181261

Reducing data communication overhead for DOACROSS loop nests

Abstract: If the iterations of a loop nest cannot be partitioned into independent sets, data communication for the data dependences is inevitable in order to execute the nest on parallel machines. Such loop nests are referred to as Doacross loop nests. This paper is concerned with compiler algorithms for parallelizing Doacross loop nests for distributed-memory multicomputers. We present a method that combines loop tiling, chain-based scheduling and indirect message passing to generate efficient message-passing p…
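To make the definition concrete, here is a minimal illustrative Doacross loop nest; it is our example, not taken from the paper. Each iteration reads values produced by its neighbors in both dimensions, so the iteration space cannot be partitioned into independent sets and any parallel execution must exchange data:

```c
/* Illustrative Doacross loop nest (not from the paper): iteration
 * (i, j) depends on (i-1, j) and (i, j-1), so no partition of the
 * iterations into independent sets exists. */
#define N 1024

static double a[N][N];

void doacross_example(void)
{
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            /* Loop-carried dependences along both dimensions. */
            a[i][j] = 0.5 * (a[i - 1][j] + a[i][j - 1]);
}
```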

Cited by 21 publications (12 citation statements)
References 8 publications
“…In every parallel time step each process performs uninterrupted computation within a single tile and communicates with its n neighbors in order to exchange data. Note that, even if the dependencies of the problem lead to the need for data exchange with diagonal neighbors, one can apply indirect message passing techniques (discussed in [35]), in order to limit the neighboring processes to the n nondiagonal ones. If t_c is the time to compute one iteration, t_s is the communication startup latency, t_t is the time to transmit a unit of data and k is the mapping dimension (i.e.…”
Section: Scheduling, Mapping and Parallel Execution Time
confidence: 99%
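The quoted statement breaks off before the resulting expression. As a hedged sketch (our assumption, not a formula quoted from the citing paper), a per-step cost model using the symbols above would charge one tile's computation plus a startup and transmission term for each of the n non-diagonal neighbors; g (iterations per tile) and w (data units sent per neighbor) are symbols we introduce for illustration:

```latex
% Hedged per-step cost sketch with the quoted symbols t_c, t_s, t_t;
% g and w are assumed here, and the mapping dimension k would enter
% through the number of parallel steps, which the quote truncates.
T_{\text{step}} \;\approx\; g\,t_c \;+\; n\left(t_s + w\,t_t\right)
```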
“…Consequently, only neighboring processes need to communicate assuming reasonably coarse parallel granularities, taking into account that distributed memory architectures are addressed. According to the above, we only consider unitary process communication directions for our analysis, since all other non-unitary process dependencies can be satisfied according to indirect message passing techniques, such as the ones described in [19]. However, in order to preserve the communication pattern of the application, we consider a weight factor d_i for each process dependence direction i, implying that if iteration j = (j_1, …”
Section: Algorithmic Model
confidence: 99%
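Several of these statements lean on the indirect message passing idea, so a minimal MPI sketch of the pattern may help. It is our illustration, not code from [19] or from the cited paper; the rank arithmetic, tags, and function names are assumptions. Data bound for a diagonal neighbor travels via a non-diagonal neighbor in two unitary hops, so no process ever opens a diagonal channel:

```c
/* Hedged sketch of indirect message passing on a 2-D process grid:
 * data for the diagonal neighbor (+1, +1) is routed through the
 * row neighbor (0, +1), so only unitary directions carry messages.
 * Rank layout, tags, and names are illustrative assumptions. */
#include <mpi.h>

void send_diagonal_indirect(double *diag_data, int count,
                            int my_row, int my_col, int cols,
                            MPI_Comm comm)
{
    int right = my_row * cols + (my_col + 1);   /* (0, +1) neighbor */
    /* Step 1: hand the diagonal payload to the row neighbor. */
    MPI_Send(diag_data, count, MPI_DOUBLE, right, /*tag=*/1, comm);
}

/* Step 2, executed by that intermediate neighbor: receive the payload
 * and forward it downward, completing the (+1, +1) route. */
void forward_diagonal(double *buf, int count,
                      int my_row, int my_col, int cols,
                      MPI_Comm comm)
{
    int left  = my_row * cols + (my_col - 1);
    int below = (my_row + 1) * cols + my_col;   /* (+1, 0) neighbor */
    MPI_Recv(buf, count, MPI_DOUBLE, left, /*tag=*/1, comm,
             MPI_STATUS_IGNORE);
    MPI_Send(buf, count, MPI_DOUBLE, below, /*tag=*/2, comm);
}
```

In practice the forwarded payload is typically packed into the message the intermediate process already sends in that direction, which is what makes the indirection nearly free.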
“…All message passing communication is performed outside of the parallel region (lines 5-9 and 20-23), while the multi-threading parallel computation occurs in lines 10-19. Note that no explicit barrier is required for thread synchronization, as this effect is implicitly achieved by exiting the multi-threading parallel region.…”
Section: Fine-grain Hybrid Model
confidence: 99%
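The structure this statement describes (message passing strictly outside the threaded region, with the region's implicit exit barrier standing in for explicit thread synchronization) is the classic fine-grain hybrid pattern. The sketch below is our own illustration of it, with hypothetical names (tile, halo, prev, next); it does not reproduce the cited code or its line numbering:

```c
/* Hedged sketch of the fine-grain hybrid pattern described above:
 * MPI calls sit outside the OpenMP region; leaving the region acts
 * as the thread barrier. Names (tile, halo, ...) are hypothetical. */
#include <mpi.h>
#include <omp.h>

void hybrid_time_step(double *tile, double *halo, int n,
                      int prev, int next, MPI_Comm comm)
{
    /* Message passing before the parallel region. */
    MPI_Sendrecv(tile, n, MPI_DOUBLE, next, 0,
                 halo, n, MPI_DOUBLE, prev, 0,
                 comm, MPI_STATUS_IGNORE);

    /* Multi-threaded computation; the implicit barrier at the end of
     * the parallel region synchronizes threads, so no explicit
     * barrier is needed before the next communication phase. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        tile[i] += 0.5 * halo[i];

    /* Message passing after the parallel region would follow here. */
}
```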
“…i, 1 ≤ j ≤ m}, and implementing the indirect message passing techniques discussed in [16]. The goal of this paper is to determine a rectangular tiling transformation that minimizes the communication volume of a typical, non-boundary process during the parallel execution of the tiled iteration space.…”
Section: Definition Of the Problem
confidence: 99%
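The quote stops just short of the objective function. As a hedged sketch (our assumption, not a formula from the paper), the communication volume of a non-boundary process under a rectangular tiling with edge sizes x_1, …, x_n is commonly written as a weighted sum of the tile's facets, one per unitary communication direction:

```latex
% Hedged sketch: each unitary direction i with weight d_i contributes
% the facet of the tile perpendicular to it.
V(x_1,\dots,x_n) \;=\; \sum_{i=1}^{n} d_i \prod_{j \neq i} x_j
```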