Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores 2015
DOI: 10.1145/2712386.2712405

Supporting multiple accelerators in high-level programming models

Abstract: Computational accelerators, such as manycore NVIDIA GPUs, Intel Xeon Phi and FPGAs, are becoming common in workstations, servers and supercomputers for scientific and engineering applications. Efficiently exploiting the massive parallelism these accelerators provide requires the design and implementation of productive programming models. In this paper, we explore support of multiple accelerators in high-level programming models. We design novel language extensions to OpenMP to support offloading data and comp…



Cited by 14 publications (11 citation statements) · References 17 publications
“…The multi-GPU problem has been investigated in the literature. For example, the approaches proposed in [21,20] suggest extending OpenMP to support multiple accelerators in a seamless way. OpenACC has runtime functions to support the use of multiple GPUs but lacks GPU-to-GPU data transfer, both within a single node [19] and across nodes [9].…”
Section: State of the Art
confidence: 99%
“…Yan et al [22] extend the OpenMP [4] and OpenACC [20] pragmas to support scheduling entire kernel executions on different devices. By using annotations, Yan et al aim to modify existing code, thus reducing the barrier to uptake and integration with existing code bases.…”
Section: Scheduling of Kernels
confidence: 99%
“…The extensions by Yan et al [22] build on existing work to support specifying how data should be partitioned between different hardware devices during a computation:…”
Section: Data Partitioning
confidence: 99%
“…• A range of new methods to fairly compare the efficiency of server architectures (Section VI) and scale these architectures on demand to meet workload QoS requirements [6], [7]. NanoStreams advances the state of the art in micro-servers in several ways by: (a) adding application-specific but programmable hardware accelerators to micro-servers, as opposed to existing solutions that use elaborate hardware design flows and target a single algorithm [8]; (b) providing general-purpose low-latency networking to access accelerators in the datacentre, as opposed to custom fabrics [9]; (c) effectively integrating streaming and accelerator-aware programming models into domain-specific software stacks, moving one step ahead of ongoing efforts to unify heterogeneous programming models [10]; (d) significantly improving the energy efficiency of micro-servers via on-demand and QoS-aware scale-out and acceleration [6], [7].…”
Section: Introductionmentioning
confidence: 99%