An Integrated Approach for Processor Allocation and Scheduling of Mixed-Parallel Applications

Vydyanathan, Nagavijayalakshmi; Krishnamoorthy, Srikumar; Sabin, Gerald; Çatalyürek, Ümit V.; Kurç, Tahsin; Sadayappan, P.; Saltz, Joel

doi:10.1109/icpp.2006.22

Cited by 21 publications

(28 citation statements)

References 19 publications


“…Several practical PTG scheduling algorithms based on heuristics have been proposed in the literature [6], [8], [9], [10], [11]. Like the guaranteed algorithms discussed earlier, the algorithms in [6], [8], [9], [10] proceed in two phases.…”

Section: Single Homogeneous Cluster (mentioning)

confidence: 99%

“…These last three algorithms all use a list-scheduling-based task mapping phase by which tasks are mapped to processors in order of decreasing "bottom level" (i.e., distance to the PTG's exit task), accounting for data communication and data redistribution costs. The iCASLB one-step algorithm in [11] was shown to lead to better performance than some two-step algorithms, including CPA, while maintaining reasonable complexity. This algorithm performs allocation and mapping simultaneously by iteratively increasing the allocations of tasks on the critical path, with a look-ahead mechanism to avoid being trapped in local minima, and a backfilling approach to improve the schedule.…”

Section: Single Homogeneous Cluster (mentioning)

confidence: 99%
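
The "bottom level" ordering mentioned in the statement above can be illustrated with a minimal sketch. It is not the iCASLB algorithm or any of the cited heuristics; the task names, execution times, and communication costs are hypothetical, and each task is assumed to have a fixed execution time (that is, processor allocations are already decided).

```python
def bottom_levels(exec_time, succs, comm=None):
    """Return {task: bottom level} for a DAG.

    exec_time: {task: execution time}            (hypothetical input format)
    succs:     {task: list of successor tasks}
    comm:      {(task, successor): communication cost}, optional
    The bottom level of a task is the longest path from the start of that
    task to the end of the exit task, including edge communication costs.
    """
    comm = comm or {}
    memo = {}

    def bl(task):
        if task not in memo:
            # Longest remaining path through any successor; 0 for an exit task.
            tail = max(
                (comm.get((task, s), 0) + bl(s) for s in succs.get(task, ())),
                default=0,
            )
            memo[task] = exec_time[task] + tail
        return memo[task]

    return {task: bl(task) for task in exec_time}


def priority_order(exec_time, succs, comm=None):
    """List tasks by decreasing bottom level (ties broken by task name)."""
    levels = bottom_levels(exec_time, succs, comm)
    return sorted(exec_time, key=lambda t: (-levels[t], t))


# Hypothetical four-task graph: A -> {B, C} -> D
exec_time = {"A": 2, "B": 3, "C": 1, "D": 2}
succs = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
comm = {("A", "B"): 1, ("A", "C"): 4}

print(priority_order(exec_time, succs, comm))
```

With strictly positive execution times, a task always has a larger bottom level than any of its successors, so this order is also a valid topological order for a list scheduler to walk.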

“…From a theoretical standpoint, although the scheduling problem is NP-complete, algorithms with performance guarantees, defined as the maximum ratio between the produced makespan and the optimal makespan, have been developed in [2], [3], [4], [5]. From a more applied standpoint, many nonguaranteed heuristics have been proposed and shown to lead to good average performance in practice [6], [7], [8], [9], [10], [11].…”

Section: -F Dutot Is With Univ Pierre Mendès-france Grenoble 2 / L (mentioning)

confidence: 99%


Scheduling Parallel Task Graphs on (Almost) Homogeneous Multicluster Platforms

Dutot, N'Takpé, Suter, et al. 2009

IEEE Trans. Parallel Distrib. Syst.


Abstract: Applications structured as parallel task graphs exhibit both data and task parallelism, and arise in many domains. Scheduling these applications efficiently on parallel platforms has been a long-standing challenge. In the case of a single homogeneous platform, such as a cluster, results have been obtained both in theory, i.e., guaranteed algorithms, and in practice, i.e., pragmatic heuristics. Due to task parallelism these applications are well suited for execution on distributed platforms that span multiple clusters possibly in multiple institutions. However, the only available results in this context are non-guaranteed heuristics. In this paper we develop a scheduling algorithm, MCGAS, which is applicable to multi-cluster platforms that are almost homogeneous. Such platforms are often found as large subsets of multi-cluster platforms. Our novel contribution is that MCGAS computes task allocations so that a (tunable) performance guarantee is provided. Since a performance guarantee does not necessarily imply good average performance in practice, we also compare MCGAS with a recently proposed non-guaranteed algorithm. Using simulation over a wide range of experimental scenarios, we find that MCGAS leads to better average application makespans than its competitor.

Index Terms: Mixed parallelism, parallel task graph scheduling, performance guarantee, multi-cluster platform


“…Then overlap_{j1,j2} is extracted as shown in Table 1. Since Tile 1 and Tile 5 are overlapped, overlap_{1,5} and overlap_{5,1} take 1. On the other hand, overlap_{1,6} and overlap_{6,1} take 0 because Tile 1 and Tile 6 do not share any cores.…”

Section: A Greedy Algorithm (mentioning)

confidence: 99%
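
The overlap indicator in the statement above can be sketched as follows. This is only an illustration under assumptions: each tile is modeled as a set of core ids, and the core sets below are hypothetical (the quoted text gives only the outcome that Tile 1 and Tile 5 share cores while Tile 1 and Tile 6 do not). It is not the citing paper's implementation.

```python
from itertools import combinations


def overlap_table(tiles):
    """Return {(j1, j2): 0 or 1} for every ordered pair of distinct tiles.

    tiles: {tile id: set of core ids}   (hypothetical representation)
    An entry is 1 when the two tiles share at least one core, else 0.
    The table is symmetric, so both (j1, j2) and (j2, j1) are filled in.
    """
    table = {}
    for j1, j2 in combinations(sorted(tiles), 2):
        shared = 1 if tiles[j1] & tiles[j2] else 0
        table[(j1, j2)] = shared
        table[(j2, j1)] = shared
    return table


# Hypothetical core sets chosen to match the outcome described in the text.
tiles = {1: {0, 1}, 5: {1, 2}, 6: {4, 5}}
table = overlap_table(tiles)

print(table[(1, 5)], table[(5, 1)])  # Tile 1 and Tile 5 share core 1
print(table[(1, 6)], table[(6, 1)])  # Tile 1 and Tile 6 share no cores
```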

“…In other words, an application is assigned a single core. Techniques presented in [2]-[5] take into account data parallelism within applications (intra-application parallelism) as well as application-level parallelism (inter-application parallelism). Their methods perform scheduling and mapping simultaneously, aiming at minimization of schedule length or pipeline throughput.…”

Section: Related Work (mentioning)

confidence: 99%

Static Mapping with Dynamic Switching of Multiple Data-Parallel Applications on Embedded Many-Core SoCs

Taniguchi, Kaida, Hieda, et al. 2014

IEICE Trans. Inf. & Syst.


SUMMARY: This paper studies mapping techniques of multiple applications on embedded many-core SoCs. The mapping techniques proposed in this paper are static, which means the mapping is decided at design time. The mapping techniques take into account both inter-application and intra-application parallelism in order to fully utilize the potential parallelism of the many-core architecture. Additionally, the proposed static mapping supports dynamic application switching, which means the applications mapped onto the same cores are switched to each other at runtime. Two approaches are proposed for static mapping: one approach is based on integer linear programming and the other is based on a greedy algorithm. Experimental results show the effectiveness of the proposed techniques.


Optimizing layer‐based scheduling algorithms for parallel tasks with dependencies

Kunis, Rünger 2010

Concurrency and Computation


SUMMARY: Programming with parallel tasks leads to task graphs with dependencies representing a parallel program. Scheduling algorithms are employed to find an efficient execution order of the parallel tasks. A large variety of scheduling algorithms exist, including layer-based scheduling algorithms for homogeneous target platforms that build consecutive layers of independent parallel tasks and schedule each layer separately. Although these scheduling algorithms provide good results in terms of scheduling algorithm runtime and schedule execution time, the resulting schedules leave room for optimization. This article proposes an optimization for arbitrary layer-based scheduling algorithms, which is called Move-blocks algorithm. Given a layer-based schedule of the parallel tasks, this algorithm moves blocks of parallel tasks into preceding layers in order to reduce the overall execution time of a task-based application. Suitable blocks of parallel tasks are identified by the algorithm Find-blocks, which is employed together with the Move-blocks algorithm. The algorithm Move-blocks is applied to four well-known scheduling algorithms. A detailed evaluation for a wide range of test cases is given.
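
The layer-building step described in this summary can be sketched as below. This shows only one common way to partition a task graph into consecutive layers of mutually independent tasks (each task goes one layer after its deepest predecessor). It is an assumption for illustration, not the Move-blocks or Find-blocks algorithm, and the task names and input format are hypothetical.

```python
def build_layers(tasks, preds):
    """Partition a DAG into consecutive layers of independent tasks.

    tasks: iterable of task names
    preds: {task: list of predecessor tasks}   (hypothetical input format)
    A task with no predecessors lands in layer 0; any other task lands one
    layer after its deepest predecessor, so tasks within a layer never
    depend on each other.
    """
    depth = {}

    def layer_of(task):
        if task not in depth:
            depth[task] = 1 + max(
                (layer_of(p) for p in preds.get(task, ())), default=-1
            )
        return depth[task]

    layers = {}
    for task in tasks:
        layers.setdefault(layer_of(task), []).append(task)
    return [layers[i] for i in sorted(layers)]


# Hypothetical graph: A -> C -> D, and B -> D
tasks = ["A", "B", "C", "D"]
preds = {"C": ["A"], "D": ["B", "C"]}

print(build_layers(tasks, preds))
```

A layer-based scheduler would then schedule each returned layer separately, one after another.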
