2020
DOI: 10.1007/978-3-030-58144-2_19

Toward Supporting Multi-GPU Targets via Taskloop and User-Defined Schedules

Cited by 8 publications (5 citation statements)
References 18 publications
“…Torres et al. [19] propose extensions of OpenMP to distribute workload between multiple devices. Kale et al. [20] propose extensions to OpenMP task constructs to schedule loop computations across multiple GPUs.…”
Section: Related Work (mentioning, confidence: 99%)
“…Research works (Xu et al. [14], Komoda et al. [15], Yan et al. [16], [17], Cho et al. [18], Torres et al. [19], Kale et al. [20]) propose extensions to OpenMP and OpenACC to automate the complex process of distributing the computations and data of parallel loops between CPUs and accelerators. However, these works focus on the homogeneous distribution of loop iterations across multiple GPUs to achieve load balance.…”
Section: Introduction (mentioning, confidence: 99%)
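As a rough illustration of what such a homogeneous distribution of loop iterations across multiple GPUs can look like in plain OpenMP target offload, consider the following minimal sketch. The function and variable names are hypothetical and not taken from any of the cited works.

    /*
     * Illustrative sketch only: an equal-sized block of a parallel loop
     * is offloaded to each available GPU with standard OpenMP 4.5+
     * target offload.
     */
    #include <omp.h>

    void saxpy_blocked(long n, float alpha, float *x, float *y) {
        int ndev = omp_get_num_devices();
        if (ndev < 1) ndev = 1;               /* fall back to the host */
        long chunk = (n + ndev - 1) / ndev;   /* same-sized block per device */

        for (int d = 0; d < ndev; ++d) {
            long lo = (long)d * chunk;
            long hi = lo + chunk < n ? lo + chunk : n;
            if (lo >= hi) break;
            /* nowait turns each target region into a deferred task, so
               all devices are filled with work before the host waits */
            #pragma omp target teams distribute parallel for nowait \
                    device(d) map(to: x[lo:hi-lo]) map(tofrom: y[lo:hi-lo])
            for (long i = lo; i < hi; ++i)
                y[i] = alpha * x[i] + y[i];
        }
        #pragma omp taskwait                  /* join all offloaded blocks */
    }

Every device receives an equal-sized block, which balances load only when the devices perform identically — the limitation the statement above points out.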
“…But the main differences arise from the introduction of an abstraction such as Compute Unit, and also from the lack of support for a distributed shared memory approach. More recent works have evaluated the OpenMP 5.2 specification [27,28] and have indicated the lack of appropriate work-distribution schemes for hybrid executions, as well as the absence of support for resolving the entanglement between work distribution and data placement in a distributed shared-memory architecture.…”
Section: Related Work (mentioning, confidence: 99%)
“…Concerning OpenMP, several proposals have been made to address the use of multiple devices. One of them shows how OpenMP can be used to assign work to multiple GPUs on a node by collectively offloading tasks containing OpenMP target regions to the GPUs of a multi-GPU environment [16]. However, their implementation is built entirely from current language features, rather than being implemented directly in the compiler infrastructure.…”
Section: Related Work (mentioning, confidence: 99%)
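A minimal sketch of that idea, assembled from current OpenMP language features as the statement describes. The chunk count, the round-robin device choice, and all names here are illustrative assumptions, not the authors' exact scheme.

    /*
     * Illustrative sketch only: taskloop generates one task per chunk,
     * and each task offloads its chunk, as a target region, to one GPU
     * of a multi-GPU node — in the spirit of the approach in [16].
     */
    #include <omp.h>

    void scale_taskloop(long n, float alpha, float *a) {
        int ndev = omp_get_num_devices();
        if (ndev < 1) ndev = 1;               /* fall back to the host */
        long nchunks = 4L * ndev;             /* a few tasks per device */
        long csz = (n + nchunks - 1) / nchunks;

        #pragma omp parallel
        #pragma omp single
        #pragma omp taskloop
        for (long c = 0; c < nchunks; ++c) {
            long lo = c * csz;
            long hi = lo + csz < n ? lo + csz : n;
            if (lo >= hi) continue;
            int dev = (int)(c % ndev);        /* round-robin placement */
            #pragma omp target teams distribute parallel for \
                    device(dev) map(tofrom: a[lo:hi-lo])
            for (long i = lo; i < hi; ++i)
                a[i] *= alpha;
        }
    }

Because taskloop generates the tasks and several host threads execute them concurrently, multiple GPUs can be driven at once without compiler changes, which matches the citation's point that the approach works within the existing language features.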