2011
DOI: 10.1007/978-3-642-19595-2_15
Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL

Cited by 28 publications (21 citation statements)
References 11 publications
“…Ferrer et al. [17] proposed OmpSs, a programming model based on OpenMP and StarSs, which can also incorporate the use of OpenCL or CUDA kernels. They evaluated their model with four benchmarks on three different types of hardware platforms (Intel Xeon server, Cell/B.E., Nvidia GPUs), and compared the results with those obtained from the same benchmarks written in OpenCL.…”
Section: Related Work
confidence: 99%
“…In the case of GPGPUs, those (low-level) libraries include Brook [13], Nvidia CUDA, and OpenCL. At a higher level, Offload [14] enables offloading of parts of a C++ application, wrapped in offload blocks, onto hardware accelerators for asynchronous execution; OmpSs [15] enables the offloading of OpenCL and CUDA kernels as an OpenMP extension [16]. FastFlow, in contrast with these frameworks, does not target specific (hardware) accelerators but realizes a virtual accelerator running on the main CPUs and thus does not require the development of device-specific code.…”
Section: Related Work
confidence: 99%
“…We think that gathering all such information will easily exceed the capabilities of the analysis tool for large runs, and we would like to tackle this issue by rethinking which of the generated events are really required and which could be optional. Additionally, we are also working on a version of Nanos++ for distributed-memory systems [9] and accelerators [12]. We plan to extend the instrumentation mechanism in order to support these versions of the runtime.…”
Section: Discussion
confidence: 99%