2020
DOI: 10.1007/s11227-020-03257-3
Efficiency and productivity for decision making on low-power heterogeneous CPU+GPU SoCs

Cited by 12 publications (4 citation statements) · References 16 publications
“…For both CUDA and HIP, the first two operations may be fulfilled by the {cuda,hip}MemGetInfo and {cuda,hip}Malloc routines, the latter of which produces C-style pointers to device memory segments. As of the SYCL 2020 standard (as adopted by DPC++) [14], similar allocation semantics may be achieved using unified shared memory (USM) [62,63] via the routines cl::sycl::device::get_info and cl::sycl::malloc_device, respectively. Given the C-pointers to device memory segments from any of the above programming models, pool allocations for the preallocated memory may be implemented in a high-level device-agnostic language (e.g., C/C++) in a straightforward manner.…”
Section: Performance Portability by Modular Software Design
confidence: 99%
“…As far as we know, the only work that addresses co-execution with oneAPI is [37]. The authors extended the Intel TBB parallel_for function to allow simultaneous execution of the same kernel on CPU and GPU.…”
Section: Related Work
confidence: 99%
“…Moreover, oneAPI is becoming very popular. It has shown promising results in various computing fields, such as machine learning (Goli et al, 2020) or decision-making (Constantinescu et al, 2020). One of the keys to its growing impact on the heterogeneous field is the existence of optimized libraries that can be used together with oneAPI.…”
Section: Introduction
confidence: 99%