2021
DOI: 10.1109/access.2021.3073955
Efficient Inter-Device Task Scheduling Schemes for Multi-Device Co-Processing of Data-Parallel Kernels on Heterogeneous Systems

Abstract: Heterogeneous systems consisting of multiple multi-core CPUs and many-core accelerators have recently come into wide use, and more and more parallel applications are being developed for such heterogeneous systems. To fully utilize multiple compute devices for cooperative and concurrent execution of data-parallel kernels on heterogeneous systems, a feedback-based dynamic and elastic task scheduling scheme is proposed, which provides better load balance, higher device utilization, and lower scheduling overhead…
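
To make the scheduling idea concrete, a minimal sketch of such a feedback-based dynamic chunking loop is given below in C++/OpenMP. It is illustrative only: the Device type, the chunk-sizing rule (roughly one millisecond of work per chunk), and the 1024-iteration base granularity are assumptions of this sketch, not details of the paper's actual policy.

    #include <algorithm>
    #include <atomic>
    #include <chrono>
    #include <cstddef>
    #include <vector>
    #include <omp.h>

    // Illustrative device handle (hypothetical): runs one chunk of the
    // iteration space and returns the elapsed seconds.
    struct Device {
        double throughput = 1.0e6;  // iterations/second, refined by feedback

        double run_chunk(std::size_t first, std::size_t count) {
            auto t0 = std::chrono::steady_clock::now();
            volatile double sink = 0.0;
            for (std::size_t i = first; i < first + count; ++i)
                sink += static_cast<double>(i);  // stand-in for the real kernel
            auto t1 = std::chrono::steady_clock::now();
            return std::chrono::duration<double>(t1 - t0).count();
        }
    };

    // Feedback-based dynamic chunking: each device repeatedly claims a
    // chunk sized in proportion to its most recently measured speed, so
    // faster devices take more work and the load stays balanced.
    void co_execute(std::vector<Device>& devices, std::size_t total_iters) {
        std::atomic<std::size_t> next{0};
        #pragma omp parallel num_threads((int)devices.size())
        {
            Device& dev = devices[omp_get_thread_num()];
            for (;;) {
                std::size_t chunk = std::max<std::size_t>(
                    1024, static_cast<std::size_t>(dev.throughput * 1e-3));
                std::size_t first = next.fetch_add(chunk);
                if (first >= total_iters) break;
                std::size_t count = std::min(chunk, total_iters - first);
                double secs = dev.run_chunk(first, count);
                if (secs > 0.0) dev.throughput = count / secs;  // feedback
            }
        }
    }

Each device claims work in chunks proportional to its observed speed, which is the basic mechanism by which this family of schedulers balances load across devices of very different throughput.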

Cited by 13 publications (6 citation statements) · References 37 publications
“…For the sake of brevity, the details of the HCE runtime system and the inter-device scheduling policies are not described in this article; please refer to our previous work. 17,18 In short, with the help of HeteroPP, programmers only need to focus on writing data-parallel compute kernels using the extended OpenMP directives and clauses, and do not need to care about the complicated implementation details of multi-device co-processing of those kernels.…”
Section: Overall Design of HeteroPP
confidence: 99%
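
For context, the kind of data-parallel kernel that such directive-based programming targets can be expressed with stock OpenMP offload constructs, as in the sketch below. It uses only standard OpenMP 4.5+ directives; HeteroPP's actual extended directives and clauses are not reproduced here, since their syntax is not given in the excerpt.

    #include <cstddef>

    // A data-parallel SAXPY kernel offloaded with standard OpenMP device
    // directives. HeteroPP-style extensions would additionally let the
    // runtime split this iteration space across several devices.
    void saxpy(float a, const float* x, float* y, std::size_t n) {
        #pragma omp target teams distribute parallel for \
                map(to: x[0:n]) map(tofrom: y[0:n])
        for (std::size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }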
“…The runtime system is mainly composed of four components: device management, memory management, task scheduling, and transfer optimization, and each component provides a series of runtime application programming interfaces (APIs). The runtime system currently implements our previously proposed inter-device scheduling policies, 17,18 including FDETS (the feedback-based dynamic and elastic task scheduling policy), MFDETS (a modified FDETS that supports incremental data transfer), ADETS (the asynchronous dynamic and elastic task scheduling policy), and MADETS (a modified ADETS that supports three-way overlapping communication optimization). For the sake of brevity, the details of the HCE runtime system and the inter-device scheduling policies are not described in this article; please refer to our previous work. 17,18 …”
Section: Overall Design of HeteroPP
confidence: 99%
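
As an illustration of what the asynchronous, overlap-oriented variants aim at, the following double-buffered loop overlaps the transfer of one chunk with the computation of another using OpenMP's nowait and depend clauses. This is a sketch under assumed semantics, not the actual ADETS/MADETS implementation; the chunking scheme and the stand-in kernel are inventions of the example.

    #include <cstddef>

    // Pipelined offload sketch: while chunk k is being computed on the
    // device, chunk k+1 can already be in flight, approximating the
    // communication/computation overlap that asynchronous policies exploit.
    void pipelined(float* data, std::size_t n, std::size_t chunk) {
        for (std::size_t off = 0; off < n; off += chunk) {
            std::size_t len = (off + chunk < n) ? chunk : n - off;
            // Stage the chunk on the device asynchronously.
            #pragma omp target enter data map(to: data[off:len]) \
                    depend(out: data[off]) nowait
            // Compute on the staged chunk once its transfer has finished.
            #pragma omp target teams distribute parallel for \
                    depend(inout: data[off]) nowait
            for (std::size_t i = off; i < off + len; ++i)
                data[i] *= 2.0f;  // stand-in kernel
            // Copy the result back, again without blocking the host.
            #pragma omp target exit data map(from: data[off:len]) \
                    depend(in: data[off]) nowait
        }
        #pragma omp taskwait  // drain all outstanding transfers and kernels
    }

Because each chunk carries its own dependence chain (keyed on data[off]), transfers and kernels belonging to different chunks are free to overlap, while the stages within one chunk stay correctly ordered.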
“…In SIMD computing, several values are processed simultaneously by a single instruction, in contrast with the typical structure of CPUs. Consequently, GPUs require specific data structures and scheduling approaches to fully exploit their parallel capabilities [12, 18-23]. Owing to their smaller memory relative to the host, GPUs can only accommodate a subset of the complete graph as input.…”
Section: Introduction
confidence: 99%
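
The memory constraint mentioned here is commonly handled by streaming the graph through the device in partitions that fit its memory. The sketch below illustrates the idea for an edge list; the Edge layout, the degree-counting kernel, and the device_capacity parameter are assumptions of this illustration, not taken from the cited works.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Edge { int src, dst; };

    // Stream an edge list that exceeds device memory through the GPU in
    // fixed-size partitions, so each offloaded chunk fits on the device.
    void count_in_degrees(const std::vector<Edge>& edges,
                          std::vector<int>& degree,       // per-vertex result
                          std::size_t device_capacity) {  // edges per chunk
        int* deg = degree.data();
        std::size_t nv = degree.size();
        #pragma omp target enter data map(to: deg[0:nv])
        for (std::size_t off = 0; off < edges.size(); off += device_capacity) {
            std::size_t len = std::min(device_capacity, edges.size() - off);
            const Edge* chunk = edges.data() + off;
            // Only the current partition of the edge list is resident.
            #pragma omp target teams distribute parallel for map(to: chunk[0:len])
            for (std::size_t i = 0; i < len; ++i) {
                #pragma omp atomic
                deg[chunk[i].dst]++;
            }
        }
        #pragma omp target exit data map(from: deg[0:nv])
    }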