Machine virtualization and cloud computing environments have attracted considerable attention over the last several years. This trend is driven by the effort to improve machine utilization and reduce the cost of ownership. Meanwhile, in high-performance computing, the graphics processing unit (GPU) has proved its capability for general-purpose computing in many research areas. Whereas traditional APIs such as OpenGL and Direct3D program the GPU as a graphics device, NVIDIA's CUDA and OpenCL provide a more general programming environment. By supporting a memory access model, interfaces for accessing GPUs directly, and programming toolkits, they let users perform parallel computation on hundreds of GPU cores. In this paper, we propose a GPU virtualization mechanism for exploiting GPUs in virtualized cloud computing environments. Unlike previous work, which mostly reimplemented GPU programming APIs and virtual device drivers, our proposed mechanism uses direct pass-through of the PCI-E channel to which the GPU is attached. The main limitation of previous approaches is virtualization overhead. Because they focused on sharing a GPU among virtual machines, they reimplemented the GPU programming APIs at the virtual machine monitor (VMM) level, which incurred significant performance overhead. Moreover, whenever the APIs change, most of the reimplemented APIs must be reengineered. Our approach bypasses the VMM layer with negligible overhead, so it achieves computation performance similar to that of a bare-metal system and is transparent to the GPU programming APIs.
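As a rough illustration of what direct pass-through looks like from inside a guest: a passed-through GPU appears as an ordinary PCI device, so an unmodified vendor driver and GPU programming toolkit can use it. The sketch below is not part of the proposed mechanism. It assumes a Linux guest that exposes PCI devices through the standard sysfs layout, and the function name `find_nvidia_gpus` and its `sysfs_root` parameter are hypothetical names chosen for this example.

```python
import os

NVIDIA_VENDOR_ID = 0x10DE   # PCI vendor ID assigned to NVIDIA
DISPLAY_CLASS = 0x03        # PCI base class for display controllers


def _read_hex(path):
    """Read a sysfs attribute such as '0x10de' and return it as an int."""
    with open(path) as f:
        return int(f.read().strip(), 16)


def find_nvidia_gpus(sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses of NVIDIA display devices visible to this OS.

    Assumption: a Linux system exposing PCI devices under sysfs_root, each
    with 'vendor' and 'class' attribute files.
    """
    found = []
    if not os.path.isdir(sysfs_root):
        return found
    for addr in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, addr)
        try:
            vendor = _read_hex(os.path.join(dev, "vendor"))
            pci_class = _read_hex(os.path.join(dev, "class"))
        except (OSError, ValueError):
            continue  # skip entries without readable attributes
        # The class attribute is 24 bits: base class, subclass, prog-if.
        if vendor == NVIDIA_VENDOR_ID and (pci_class >> 16) == DISPLAY_CLASS:
            found.append(addr)
    return found


if __name__ == "__main__":
    print(find_nvidia_gpus())
```

If pass-through is configured, running this inside the guest should list the GPU's PCI address much as it would on the host; an empty list suggests that no NVIDIA device has been assigned to the guest.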