Abstract: This paper presents a comparison of OpenMP and OpenCL based on parallel implementations of algorithms from various fields of computer applications. The focus of our study is on benchmark performance under OpenMP and OpenCL. We observed that the OpenCL programming model is a good option for mapping threads onto different processing cores. Balancing the load across all available cores and allocating a sufficient amount of work to every compute unit can lead to improved performance. In our simulation, we used the Fedora operating system on a system with an Intel Xeon dual-core processor with a thread count of 24, coupled with an NVIDIA Quadro FX 3800 graphics processing unit.
Graphics Processing Units (GPUs) are currently a common feature of high performance computing. Programming models such as CUDA and Open Computing Language (OpenCL) provide a standard interface for achieving high performance across these GPU devices. However, the wide variety and architectural complexity of these devices often make it difficult to write programs for these platforms. One approach to overcoming this difficulty is to convert sequential programs into equivalent parallel programs. In this paper, we present a methodology for parallelizing sequential C programs with function calls into equivalent OpenCL programs with little assistance from the programmer. Our proposed methodology identifies function calls and converts them into kernels to be executed in parallel on GPU devices. To the best of our knowledge, there are no tools dedicated to converting C code to equivalent OpenCL code.
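The idea of turning a per-element function call into a kernel can be illustrated with a minimal sketch in plain C. This is not the authors' tool or its output; the function names (`scale_and_shift`, `run_sequential`, `run_emulated_ndrange`) are hypothetical, and the host loop only stands in for what an OpenCL runtime would do when it launches one work-item per index. The comment shows roughly how the same body might look in OpenCL C, where `get_global_id(0)` replaces the loop counter.

```c
#include <stddef.h>

/* Original sequential form: a function that is called once per element. */
static float scale_and_shift(float x)
{
    return 2.0f * x + 1.0f;
}

/* Sequential caller: the loop drives the function calls one after another. */
void run_sequential(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = scale_and_shift(in[i]);
    }
}

/*
 * Kernel-style form (hypothetical): the loop disappears and each work-item
 * receives its own index. In OpenCL C this might be written roughly as:
 *
 *   __kernel void scale_and_shift_kernel(__global const float *in,
 *                                        __global float *out)
 *   {
 *       size_t gid = get_global_id(0);
 *       out[gid] = 2.0f * in[gid] + 1.0f;
 *   }
 */
static void scale_and_shift_kernel(size_t gid, const float *in, float *out)
{
    out[gid] = scale_and_shift(in[gid]);
}

/* Plain-C stand-in for an NDRange launch: one kernel call per work-item.
 * On a GPU these calls would run in parallel rather than in a loop. */
void run_emulated_ndrange(const float *in, float *out, size_t global_size)
{
    for (size_t gid = 0; gid < global_size; gid++) {
        scale_and_shift_kernel(gid, in, out);
    }
}
```

Calling `run_sequential` and `run_emulated_ndrange` on the same input should produce identical output arrays, which is the equivalence a C-to-OpenCL conversion would need to preserve. The conversion is only this direct when the function has no side effects and the loop iterations are independent of one another.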
Graphics Processing Units (GPUs) are heavily used in a variety of graphics and non-graphics applications. Many practical computing problems can be represented as graphs in order to arrive at a solution. These graphs can contain a very large number of vertices and edges, up to millions. In this paper, we present a performance analysis of Dijkstra's single-source shortest path algorithm over multiple GPU devices in a single machine, as well as over a network of workstations, using OpenCL and MPI. Experimental results show that parallel execution of Dijkstra's algorithm performs well when the algorithm is run over multiple GPU devices in a single workstation, as opposed to multiple GPU devices spread over a network of workstations. For our experiments, we used a workstation with an Intel Xeon 6-core processor supporting hyper-threading, for a total of 24 threads, with NVIDIA Quadro FX 3800 GPU devices. The two GPU devices are connected by an SLI bridge. Overall, we achieved an average performance improvement of up to 10-15x.
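For reference, the sequential baseline that such a study parallelizes can be sketched as follows. This is a minimal O(V^2) adjacency-matrix version of Dijkstra's algorithm in plain C, not the authors' OpenCL/MPI implementation; the names `dijkstra_sssp`, `NUM_VERTICES`, and `NO_EDGE` are assumptions made for illustration. The comments mark the two inner loops that are commonly mapped to GPU work-items: the minimum search (a reduction) and the edge relaxation (independent per vertex).

```c
#include <limits.h>

#define NUM_VERTICES 5   /* hypothetical graph size, for illustration only */
#define NO_EDGE 0        /* a 0 entry in the matrix means "no edge" */

/* Sequential O(V^2) Dijkstra over an adjacency matrix of positive weights.
 * On return, dist[v] holds the shortest distance from source to v,
 * or INT_MAX if v is unreachable. */
void dijkstra_sssp(int graph[NUM_VERTICES][NUM_VERTICES], int source,
                   int dist[NUM_VERTICES])
{
    int done[NUM_VERTICES] = {0};

    for (int v = 0; v < NUM_VERTICES; v++) {
        dist[v] = INT_MAX;
    }
    dist[source] = 0;

    for (int iter = 0; iter < NUM_VERTICES; iter++) {
        /* Step 1: pick the closest unfinished, reachable vertex.
         * In a parallel version this is a min-reduction. */
        int u = -1;
        for (int v = 0; v < NUM_VERTICES; v++) {
            if (!done[v] && dist[v] != INT_MAX &&
                (u == -1 || dist[v] < dist[u])) {
                u = v;
            }
        }
        if (u == -1) {
            break;  /* every remaining vertex is unreachable */
        }
        done[u] = 1;

        /* Step 2: relax all edges leaving u. Each v is independent,
         * so this loop could be one work-item per vertex. */
        for (int v = 0; v < NUM_VERTICES; v++) {
            int w = graph[u][v];
            if (w != NO_EDGE && !done[v] && dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
            }
        }
    }
}
```

The sketch assumes small positive integer weights so that `dist[u] + w` cannot overflow. A graph with millions of vertices would need a sparse representation instead of a dense matrix.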