On supernode transformation with minimized total running time

Hodzic, E.; Shang, Weijia

doi:10.1109/71.679213

Cited by 32 publications

(34 citation statements)

References 18 publications


“…In [24], Hodzic and Shang have presented a scheme for scheduling loops that have been transformed through a supernode transformation. Their approach is to minimize total execution time, as follows: Firstly, the optimal tiling matrix H is determined and then it is applied to the original Iteration Space.…”

Section: Non-overlapping Vs Overlapping Schedule (mentioning)

confidence: 99%

“…, 1). Notice, also, that this scheduling vector is not the same with Hodzic's [24] scheduling vector, since we are now referring to groups, while Hodzic was scheduling tiles.…”

Section: J G N ) These Groups Are Executed In Neighboring Smp (mentioning)

confidence: 99%

“…For each schedule, we are interested in the overall minimum execution time achieved at an optimally selected tile height (see [22,24,43]). The experimental results, shown in Figures 18-19, illustrate that, in every case, non-blocking communication is preferable to blocking communication and hyperplane grouping is preferable to vertical grouping.…”

Section: Thread 0 Thread 1 Explanation (mentioning)

confidence: 99%

“…This problem will not concern our scheduling, but it will mean that the communication architecture is too slow to exploit all the computation power of the computing system. Then, it would be better not to use all the nodes available, as implied in [24]. If we add more CPUs inside each SMP node, we may again cut the initial iteration space into smaller tiles.…”

Section: Scalability Issues (mentioning)

confidence: 99%

“…Hodzic and Shang [24] proposed a method to correlate optimal tile size and shape, based on overall completion time reduction. Their approach considers a straightforward time schedule, where each processor executes all tiles along a specific dimension, by interleaving computation and communication phases.…”

mentioning

confidence: 99%


Hyperplane Grouping and Pipelined Schedules: How to Execute Tiled Loops Fast on Clusters of SMPs

et al. 2005


Abstract. This paper proposes a novel approach for the parallel execution of tiled Iteration Spaces onto a cluster of SMP PC nodes. Each SMP node has multiple CPUs and a single memory mapped PCI-SCI Network Interface Card. We apply a hyperplane-based grouping transformation to the tiled space, so as to group together independent neighboring tiles and assign them to the same SMP node. In this way, intranode (intragroup) communication is annihilated. Groups are atomically executed inside each node. Nodes exchange data between successive group computations. We schedule groups much more efficiently by exploiting the inherent overlapping between communication and computation phases among successive atomic group executions. The applied non-blocking schedule resembles a pipelined datapath, where group computation phases are overlapped with communication ones, instead of being interleaved with them. Our experimental results illustrate that the proposed method outperforms previous approaches involving blocking communication or conventional grouping schemes.

