2015
DOI: 10.1007/978-3-319-17473-0_2
Directive-Based Compilers for GPUs

Abstract: General-Purpose Graphics Processing Units (GPUs) can be effectively used to enhance the performance of many contemporary scientific applications. However, programming GPUs using machine-specific notations like CUDA or OpenCL can be complex and time-consuming. In addition, the resulting programs are typically fine-tuned for a particular target device. A promising alternative is to program in a conventional and machine-independent notation extended with directives and use compilers to generate GPU code auto…

Cited by 7 publications (4 citation statements); references 18 publications.
“…The XMP extensions are in charge of providing distributed arrays with a small subset of the array operations of H²TAs and without their tile-level features. As for heterogeneity, the fact that XcalableACC relies on OpenACC reduces its portability compared to OpenCL, which we use as backend, and sometimes also the performance, as OpenACC has been found to often offer considerably less performance than manually optimized kernels. In addition, unlike H²TAs, OpenACC requires explicit annotations for data movements between each host and its device(s).…”
Section: Related Work
confidence: 99%
“…The emergence of such systems has led to a resurgence of interest in parallelizing compilers. OpenACC, for instance, has been a target of several different compilers, such as accULL [Reyes et al 2012], ipmacc [Lashgar et al 2014], OpenARC [Lee and Vetter 2014], and pgcc [Ghike et al 2014]. Similarly, OpenMP 4.0 is already supported by several mainstream compilers, including gcc 4.9.0 (for C/C++), gcc 4.9.1 (for Fortran), icc 15.0 (C/C++/Fortran), and LLVM's Clang 3.7, which offers partial support for OpenMP 4.0 for C/C++.…”
Section: Related Work
confidence: 99%
“…However, as discussed in Xu et al., the lack of certain directives often makes the exploitation of multiple accelerators under this paradigm challenging for programmers. The main concern with this strategy, however, is that compiler-based approaches strongly depend on the quality of the compiler, often lacking a reasonable performance model and, worse, strongly underperforming relative to other alternatives due to missing optimization opportunities.…”
Section: Related Work
confidence: 99%