Accelerating high performance applications with CUDA and MPI

Karunadasa, N. P.; Ranasinghe, D. N.

doi:10.1109/iciinfs.2009.5429842

Cited by 36 publications

(20 citation statements)

References 2 publications

“…This because with the availability of the needed hardware testbed, the communication plugin component will evolve in order to support the Infiniband network because it is expected that the higher bandwidth allows remote GPU virtualization frameworks to experience communication performances similar to the PCIe on the path between the local GPGPU and the remote GPU resource [32], [30]. Due to the unavailability of real world applications fitting the available ARM cluster, GVirtuS has been tested using an ad hoc distributed memory matrix multiplication software [14] and accelerated CUDA kernels working on local or x86 remoted high-end GPU device [18].…”

Section: Discussion (mentioning)

confidence: 99%

On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework

Montella, Giunta, Laccetti, et al. 2016

Int J Parallel Prog

The astonishing development of diverse and different hardware platforms is twofold: on one side, the challenge for the exascale performance for big data processing and management; on the other side, the mobile and embedded devices for data collection and human machine interaction. This drove to a highly hierarchical evolution of programming models. GVirtuS is the general virtualization system developed in 2009 and firstly introduced in 2010 enabling a completely transparent layer among GPUs and VMs. This paper shows the latest achievements and developments of GVirtuS, now supporting CUDA 6.5, memory management and scheduling. Thanks to the new and improved remoting capabilities, GVirtus now enables GPU sharing among physical and virtual machines based on x86 and ARM CPUs on local workstations, computing clusters and distributed cloud appliances.

“…Message Passing Interface (MPI) [2] has been the choice of high performance computing for more than a decade and it has proven its capability in delivering higher performance in parallel applications. CUDA and MPI use different programming approaches but both of them depend on the inherent parallelism of the application to be effective.…”

Section: Comparison Study Of Parallel Computing With Alu and Gpu (Cud… (mentioning)

confidence: 99%

Heterogeneous Parallel Computing Using Cuda for Chemical Process

Sosutha, Mohana 2015

Procedia Computer Science

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. Using CUDA, the GPUs can be used for general purpose processing which involves parallel computation. CUDA has been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more. Chemical processes need validation of their experimental data. It was found that Chemical process could become one such application where CUDA can be efficiently used. These validations of Chemical processes normally involve calculation of many coefficients. The chemical process that has been chosen for parallelizing is Heat Transfer process. This process involves calculation of coefficients for multiple iterations. As each of these iterations is independent of one another, CUDA was used to parallelize the calculation process. The execution time analysis shows that though CPU outperforms GPU when the numbers of iterations are less, when the number of iterations increase the GPU outperforms CPU greatly.

“…Support standard C language programming, which could support other high-level languages like Fortran, Java and Python. Two kinds of programming interfaces provided by CUDA are device-level programming interface and language integrated programming interface [5].…”

Section: Cuda Gpu Parallel Architecture (mentioning)

confidence: 99%

A Video Deblurring Optimization Algorithm Based on Motion Detection

Zhang, Yuan 2013

Proceedings of 3rd International Conference on Multimedia Technology (ICMT-13)

Although the performance of image acquisition devices has been improved dramatically in recent years, especially in the resolution and clarity, defocusing and motion blur are still big problems. Upgrading the devices with the better hardware is one way to solve the problem, but the costs will usually increase disproportionately comparing with what we get. The appropriate image restoration algorithm could improve the clarity and the recognition rate of images significantly. However, the huge computation of those image restoration algorithms makes them unpractical. A new image restoration algorithm based on video target detection and an accelerating method using the Graphic Processing Units (GPU) parallel computing architecture are proposed, which makes it efficient enough to handle 720p high-definition (HD) video processing in real time and makes sure that only the interested blurred regions get restored and other parts of the image will not be impacted.
