Performance Traps in OpenCL for CPUs

Shen, Jie; Fang, Jianbin; Sips, Henk; Vărbănescu, Ana Lucia

doi:10.1109/pdp.2013.16

Cited by 14 publications

(1 citation statement)

References 10 publications

“…OpenCL programs are compiled just in time for execution and can be used together with Mi-AccLib or other run-time libraries. These works [16][17][18] experienced a performance penalty on the NVIDIA GPU, due to the OpenCL abstraction layer. Thus, we have disabled OpenCL support as it is not optimized for GPUs at the moment, and real gains on GPUs can only be seen through optimized code as there are additional overheads from data movement.…”

Section: Related Work (mentioning)

confidence: 99%

A Middleware Framework for Programmable Multi-GPU-Based Big Data Applications

Karuppiah; Kok; Singh

2014

GPU Computing and Applications

Current applications of GPUs to parallel computing tasks show excellent speedups compared with CPUs. However, there is no existing middleware framework that enables automatic distribution of data and processing across heterogeneous computing resources for structured and unstructured Big Data applications. Thus, we propose a middleware framework for "Big Data" analytics that provides mechanisms for automatic data segmentation, distribution, execution, and information retrieval across multiple cards (CPU and GPU) and machines; a modular design for easy addition of new GPU kernels at both the analytic and processing layers; and information presentation. The architecture and components of the framework, such as multi-card data distribution and execution, data structures for efficient memory access, and algorithms for parallel GPU computation, are shown together with results for various test configurations. Our results show that the proposed middleware framework provides users with an alternative, cheaper HPC solution. Data cleansing algorithms on the GPU show a speedup of over two orders of magnitude compared with the same operation done in MySQL on a multi-core machine. Our framework is also capable of processing more than 120 million health data records within 11 s.

Introduction

NVIDIA CUDA-enabled GPGPUs (general-purpose graphics processing units) have made their name as part of the world's supercomputers, enabling high-performance computation. Thus, GPGPUs are widely accepted and are becoming common in many high-performance computing applications. GPGPUs are used for both specific and general-purpose applications, running on either large-scale systems or desktop PCs.
