Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems

Riakiotakis, I.; Ciorba, Florina M.; Andronikos, Theodore; Παπακωνσταντίνου, Γ.; Chronopoulos, Anthony T.

doi:10.1002/cpe.2812

Cited by 9 publications

(5 citation statements)

References 40 publications

“…Comparison of 3D and 2D Tiling. As mentioned in the related work, the proposed methods in [21,33,40] could find the near-optimal partitioning of 3-nested loop with dependencies for homogeneous/heterogeneous computing systems. It targets two loops of the nested loop and considers the outer loop as synchronization dimension and another loop as scheduling dimension.…”

Section: 2 (mentioning)

confidence: 96%

“…In distributed-memory parallel systems, communication and synchronization overhead between the nodes are the important reasons of the performance degradation when running dependence loops. So, we use coarse-grain pipeline parallelism to balance trade-offs between parallelization, communication and synchronization overhead [20,33].…”

Section: Background and Related Work (mentioning)

confidence: 99%

“…Andronikos et al [21,33,40] claimed that the problem of finding the optimal partitioning of nested loops [21,33,40] on homogeneous and heterogeneous systems, respectively. The tiles with the same number can be executed simultaneously.…”

Section: Related Work (mentioning)

confidence: 99%

“…In this section, we propose an approach to 3D tiling and scheduling of three-level perfectly nested loops with dependencies on heterogeneous systems. In the paper, we use the notation in [21,33], indicated in Table 3.1. Algorithm 1 outlines the main steps of proposed method.…”

Section: Related Work (mentioning)

confidence: 99%

Tiling and Scheduling of Three-level Perfectly Nested Loops with Dependencies on Heterogeneous Systems

Zefreh, Lotfi, Khanli et al. (2016), SCPE

Nested loops are among the most time-consuming parts of many scientific applications and their largest source of parallelism. In this paper, we address the problem of 3-dimensional tiling and scheduling of three-level perfectly nested loops with dependencies on heterogeneous systems. To exploit the parallelism, we tile and schedule nested loops with dependencies according to the computational power of the processing nodes and execute them in pipeline mode. The tile size plays an important role in the parallel execution time of nested loops. We develop and evaluate a theoretical model to estimate the parallel execution time of tiled nested loops. We also propose a tiling genetic algorithm that uses this model to find a near-optimal tile size that minimizes the parallel execution time of nested loops with dependencies. We demonstrate the accuracy of the theoretical model and the effectiveness of the proposed tiling genetic algorithm through several experiments on heterogeneous systems. With the 3D heat equation as a benchmark, 3D tiling reduces the parallel execution time by a factor of 1.2× to 2× over 2D tiling.
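As a rough illustration of pipelined execution over tiles, the sketch below groups the tiles of a 3D tile grid into wavefronts. It is not the authors' model or genetic algorithm. It assumes each tile depends only on its immediate predecessor along each dimension, so tiles whose indices sum to the same number are mutually independent. The function name `wavefronts` and the grid shape are hypothetical.

```python
from collections import defaultdict
from itertools import product


def wavefronts(tiles_per_dim):
    """Group tile coordinates by wavefront number (the sum of their indices).

    Assumption: a tile depends only on its predecessor along each axis,
    so all tiles sharing a wavefront number can run at the same time.
    """
    groups = defaultdict(list)
    for coord in product(*(range(n) for n in tiles_per_dim)):
        groups[sum(coord)].append(coord)
    # Return the groups in execution order, earliest wavefront first.
    return [groups[w] for w in sorted(groups)]


if __name__ == "__main__":
    # Hypothetical 2 x 2 x 2 tile grid.
    for step, tiles in enumerate(wavefronts((2, 2, 2))):
        print(step, tiles)
```

In this simplified picture the number of wavefronts is the length of the pipeline, which is one reason the tile size affects the parallel execution time.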

“…A two-phase scheme is proposed to solve parallel regular loop scheduling problem in heterogeneous grid computing environments in [16]. In [17][18][19] new results are presented for loops with dependencies. Recent research results [20,21] have been reported for designing loop self-scheduling methods for grids.…”

Section: Related Work (mentioning)

confidence: 99%

Scalable Loop Self-Scheduling Schemes for Large-Scale Clusters and Cloud Systems

Han, Chronopoulos (2016), Int J Parallel Prog

Cloud systems have demonstrated powerful computation and storage capabilities in many scientific applications. In this paper, we propose a class of scalable distributed loop self-scheduling schemes to achieve good load balancing and scalability. We implemented these schemes on a large-scale cluster and on a heterogeneous cloud system. The schemes consider the distribution of the output data, which can help reduce communication overhead and improve scalability. We evaluated the schemes using four scientific computations: Mandelbrot set, adjoint convolution, matrix multiplication and quick sort. The results show that the new schemes achieve better load balancing, better scalability and better overall performance than standard distributed loop self-scheduling schemes.
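For readers unfamiliar with loop self-scheduling, the sketch below shows guided self-scheduling, a classic scheme in which each request for work receives a chunk of ceil(remaining / workers) iterations. It is a generic illustration and not one of the schemes proposed in this paper. The function name `guided_chunks` and its parameters are hypothetical.

```python
import math


def guided_chunks(total_iterations, workers):
    """Return the sequence of chunk sizes handed out by guided self-scheduling.

    Each request receives ceil(remaining / workers) iterations, so chunks
    start large and shrink as the loop nears completion.
    """
    remaining = total_iterations
    chunks = []
    while remaining > 0:
        size = math.ceil(remaining / workers)
        chunks.append(size)
        remaining -= size
    return chunks


if __name__ == "__main__":
    print(guided_chunks(10, 2))  # prints [5, 3, 1, 1]
```

Large early chunks keep scheduling overhead low, while the small final chunks help even out the finishing times of the workers.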

Special issue Editorial: New technologies of distributed systems

Drira, Kacem, Jmaïel (2016), Concurrency and Computation
