The goal of this paper is to develop a fully functional parallel Computational Fluid Dynamics (CFD) code that is optimized to run on a single Graphics Processing Unit (GPU). This is achieved by writing the code in FORTRAN and OpenACC (Open Accelerators), which yields an easily portable, platform-independent code. An existing CFD code is significantly modified to allow for parallel asynchronous execution. Because of the strong recursive dependencies in the Tridiagonal Matrix Algorithm (TDMA) solver, it is replaced with a Jacobi solver, which executes quickly in environments with a large number of parallel cores. A computer code for simulating 2D flow of water through an axisymmetric channel serves as the base for development. The parallel code is executed on a GPU and on single-core and multicore Central Processing Units (CPUs), and the execution times are compared between platforms. Although the Jacobi solver performs worse on single-core computers than its Gauss-Seidel counterpart, it is used to provide a baseline for comparison. It is shown that computation on finer grids takes less time on the GPU than on the CPU. On the GPU, the computation time is expected to keep increasing linearly with the number of grid cells, following the observed trend, until the GPU's physical limits, set by memory size and core count, are reached.
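
The contrast between the two solvers can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the paper's FORTRAN/OpenACC code; the function names `tdma` and `jacobi`, the tolerance, the iteration cap, and the small test system are all assumptions made for this example. In the TDMA sweep, step `i` needs the result of step `i - 1`, so the loop is inherently sequential. In a Jacobi sweep, every new value depends only on values from the previous iteration, so the inner loop iterations are independent and could in principle be distributed across many cores.

```python
def tdma(a, b, c, d):
    """Thomas algorithm for a tridiagonal system.

    a = sub-diagonal, b = diagonal, c = super-diagonal, d = right-hand side.
    a[0] and c[-1] are not used.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        # Recursive dependency: step i needs cp[i-1] and dp[i-1].
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        # Back substitution is sequential as well: x[i] needs x[i+1].
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def jacobi(a, b, c, d, tol=1e-12, max_iter=10_000):
    """Jacobi iteration for the same tridiagonal system (assumed diagonally dominant)."""
    n = len(d)
    x = [0.0] * n
    for _ in range(max_iter):
        x_new = [0.0] * n
        for i in range(n):
            # Each x_new[i] reads only the previous iterate x,
            # so these iterations are independent of one another.
            left = a[i] * x[i - 1] if i > 0 else 0.0
            right = c[i] * x[i + 1] if i < n - 1 else 0.0
            x_new[i] = (d[i] - left - right) / b[i]
        diff = max(abs(p - q) for p, q in zip(x_new, x))
        x = x_new
        if diff < tol:
            break
    return x


# Small diagonally dominant test system (hypothetical values).
n = 6
a = [-1.0] * n
b = [4.0] * n
c = [-1.0] * n
d = [float(i + 1) for i in range(n)]

x_tdma = tdma(a, b, c, d)
x_jacobi = jacobi(a, b, c, d)
```

Both routines should agree on this system to within the iteration tolerance. Jacobi needs many sweeps where TDMA needs one pass, which is the trade-off the abstract describes: more total work, but work that parallelizes.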