2019
DOI: 10.1504/ijhpcn.2019.097051
Performance evaluation of OpenMP's target construct on GPUs - exploring compiler optimisations

Abstract: OpenMP is a directive-based shared memory parallel programming model and has been widely used for many years. From OpenMP 4.0 onwards, GPU platforms are supported by extending OpenMP's high-level parallel abstractions with accelerator programming. This extension allows programmers to write GPU programs in standard C/C++ or Fortran languages, without exposing too many details of GPU architectures. However, such high-level programming models generally impose additional program optimizations on compilers and runt…

Cited by 8 publications (3 citation statements) · References 24 publications
“…The analysis presented in the previous publications can be categorized into compiler optimization, runtime overheads, and data management challenges. Compilers optimize compute kernels, achieving high performance with OpenMP target offload on CPU and GPU targets when using teams distribute parallel for constructs and avoiding explicit schedules [24], [25]. Other compiler optimization research has focused on accelerating user code that sits between the target and parallel constructs [24], [26], [27].…”
Section: Related Work
confidence: 99%
“…Compilers optimize compute kernels, achieving high performance with OpenMP target offload on CPU and GPU targets when using teams distribute parallel for constructs and avoiding explicit schedules [24], [25]. Other compiler optimization research has focused on accelerating user code that sits between the target and parallel constructs [24], [26], [27]. A detailed analysis of OpenMP 4.5 support in different compilers shows runtime overheads during the testing of different features [28].…”
Section: Related Work
confidence: 99%
“…Information about the use of OpenMP for GPU programming can be found in the OpenMP specifications [8]. The papers [9], [10], [11] explain the usage of GPU offloading pragmas. A drawback of this approach is its strong similarity to CUDA and OpenACC, since accelerator programming in OpenMP is likewise based on defining compute kernels and parallelizing loops.…”
Section: OpenMP
confidence: 99%