Heterogeneous Computing With OpenCL 2012
DOI: 10.1016/b978-0-12-387766-6.00034-7
OpenCL Extensions

Cited by 10 publications (14 citation statements); references 0 publications.
“…We used 120 image frames, and the image frame sizes were 1440 × 1080, 1280 × 720, 800 × 600, 720 × 480 and 640 × 480. In addition, we used OpenMP [8] and OpenCL [9] in order to parallelize feature extraction on the CPU and GPU. Note that, although OpenMP does not require the data copy time within a CPU, it does not provide detailed operations for asymmetric workload assignment between the host CPU core and the remaining CPU cores.…”
Section: Results
“…This provides the programmer with enough flexibility to choose the best architecture for the given task, or to select the task that optimally exploits the given platform. However, this flexibility comes at the expense of increased programming complexity [34].…”
Section: OpenCL
“…For more details on OpenCL, please refer to the specifications [35], the suggested bibliography [34], and available examples, such as [37,38].…”
Section: Main OpenCL Concepts
“…Such layers are meant (i) to encourage better partitioning of the problem towards fine-grained granularity and low communication, hence increasing the scalability to fully leverage a large number of CUs when available; and (ii) to potentially support more restricted compute architectures, by not strictly enforcing parallelism among CUs while still ensuring that the device is capable of synchronization, which can occur among PEs within each CU [15]. Figure 1 shows four scopes of memory, namely, global, constant, local, and private memories.…”
Section: Hardware Model
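The four memory scopes and the CU/PE synchronization rule from the quoted passage map directly onto OpenCL C address-space qualifiers. The device-code sketch below is illustrative (the kernel name and logic are invented, and it needs a host program and an OpenCL runtime to build and run); it touches each scope once and synchronizes only within a work-group, i.e. among PEs of one CU.

```c
/* Illustrative OpenCL C kernel: per-work-group scaled sum. */
__kernel void scaled_tile_sum(__global const float *in,   /* global: visible to all work-items   */
                              __constant float *scale,    /* constant: read-only across the NDRange */
                              __local float *tile,        /* local: shared within one work-group (CU) */
                              __global float *out)
{
    int gid = get_global_id(0);
    int lid = get_local_id(0);
    float x = in[gid] * scale[0];   /* private: per-work-item storage */

    tile[lid] = x;
    /* Synchronization is guaranteed only among PEs of the same CU,
     * i.e. work-items within one work-group. */
    barrier(CLK_LOCAL_MEM_FENCE);

    if (lid == 0) {
        float sum = 0.0f;
        for (int i = 0; i < (int)get_local_size(0); ++i)
            sum += tile[i];
        out[get_group_id(0)] = sum;
    }
}
```

There is deliberately no barrier spanning work-groups: as the quote notes, the model does not enforce parallelism (or synchronization) among CUs, which is what lets restricted architectures schedule work-groups one at a time.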