2017 29th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
DOI: 10.1109/sbac-pad.2017.8

Extending OmpSs for OpenCL Kernel Co-Execution in Heterogeneous Systems

Abstract: Heterogeneous systems offer very high potential performance but are difficult to program. OmpSs is a well-known framework for task-based parallel applications and an interesting tool for simplifying the programming of these systems. However, it does not support the co-execution of a single OpenCL kernel instance on several compute devices. To overcome this limitation, this paper presents an extension of the OmpSs framework that addresses two main objectives: the automatic division of …

Citations: cited by 1 publication (1 citation statement)
References: 30 publications (21 reference statements)
“…Perez, Stafford, Beivide, Mateo, Turel, Ayguadé and Martorell propose in Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems [7] a novel extension to the OmpSs programming model to allow the co-execution of a single OpenCL kernel in several devices, including the Auto-Tune algorithm that provides adaptive load balancing strategies. Experimental results reveal that the co-execution of single kernels on all the devices in the node is beneficial in terms of performance and energy consumption, and that the proposed scheduling algorithm gives the best overall results.…”
Section: Special Issue Contents (mentioning)
confidence: 99%