2013
DOI: 10.1016/j.jpdc.2012.07.005

An investigation of the performance portability of OpenCL

Abstract: The version presented here is a working paper or pre-print that may later be published elsewhere. If a published version is known, the WRAP URL will contain details on finding it.

Cited by 50 publications (24 citation statements)
References 14 publications
“…The work presented by Pennycook et al 14 shows how performance can suffer when bad values are chosen for various parameters, which supports the claim that autotuning is needed. However, they concentrate on a Message Passing Interface (MPI)/OpenCL approach, whereas we are benchmarking using only OpenCL.…”
Section: Related Work (mentioning)
confidence: 61%
“…A device-specific program, when executed directly on another device, typically cannot achieve good performance [20], [21]. With poor performance portability, the use of multiple heterogeneous devices will not be beneficial but cause performance penalties.…”
Section: Discussion and Future Work (mentioning)
confidence: 99%
“…The NPB [19,20] consists of five kernels including IS, EP, CG, MG, and FT and three pseudo applications including SP, BT, and LU. The OpenCL version of NPB was created by SNU [16].…”
Section: Evaluation Algorithm (mentioning)
confidence: 99%