Heterogeneous multicore architectures with CPU and add-on GPUs or streaming processors are now widely used in computer systems. These GPUs provide substantially more computation capability and memory bandwidth compared to traditional multi-cores. Also, because they are highly programmable, they provide the computational performance needed for realistic graphics rendering. Applications with general computations can also be leveraged onto these GPUs. This study discusses the architectures of these highly efficient GPUs and applies a unified programming standard called OpenCL to fully utilize their capabilities. Despite their great potential, applications of these GPUs are challenging because of their diverse underlying architectural characteristics. In this study, several optimizing techniques are applied on OpenCL-compatible heterogeneous multicore architectures to achieve thread-level and data-level parallelisms. The architectural implications of these techniques are discussed. Finally, optimization principles for these architectures will be are proposed. The experimental reveal average speedups of 24 and 430 for non-optimized and optimized kernels, respectively.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.