Graphics processing unit (GPU) computing has grown rapidly in recent years owing to advances in both hardware and the software stack. This has led to increased use of GPUs as accelerators across a broad spectrum of applications. This work deals with the use of general-purpose GPUs for computational fluid dynamics (CFD). The paper discusses strategies and findings from porting a large multifunctional CFD code to the GPU architecture. Within this framework, the most compute-intensive segment of the software, the BiCGStab linear solver using additive Schwarz block preconditioners with point Jacobi iterative smoothing, is optimized for the GPU platform using various techniques in CUDA Fortran. Representative turbulent channel and pipe flows are investigated for validation and benchmarking purposes. Both single- and double-precision calculations are highlighted. For a modest single-block grid of 64 × 64 × 64, the turbulent channel flow computations showed a speedup of about eightfold in double precision and more than 13-fold in single precision on an NVIDIA Tesla GPU over a serial run on an Intel central processing unit (CPU). For the pipe flow, consisting of 1.78 × 10⁶ grid cells distributed over 36 mesh blocks, the gains were more modest, at 4.5-fold and 6.5-fold for double and single precision, respectively.
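To make the solver structure named above concrete, the following is a minimal CPU sketch of right-preconditioned BiCGStab in which the preconditioner is a fixed number of point Jacobi sweeps. It is not the implementation described in this work: it is plain Python instead of CUDA Fortran, it treats the whole system as a single block (so no additive Schwarz decomposition is performed), and it uses dense list-of-lists storage. The names `model_matrix`, `jacobi_smooth`, and `bicgstab`, the sweep count, and the small tridiagonal test matrix are assumptions chosen for illustration only.

```python
def matvec(A, x):
    """Dense matrix-vector product A @ x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def model_matrix(n):
    """Hypothetical test matrix: tridiagonal, 4 on the diagonal, -1 off it."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def jacobi_smooth(A, r, sweeps=3):
    """Approximate z ~ A^{-1} r with a few point Jacobi sweeps from z = 0."""
    n = len(r)
    z = [0.0] * n
    for _ in range(sweeps):
        Az = matvec(A, z)
        # Each unknown is updated independently using only its diagonal entry.
        z = [z[i] + (r[i] - Az[i]) / A[i][i] for i in range(n)]
    return z


def bicgstab(A, b, tol=1e-10, max_iter=200, sweeps=3):
    """Right-preconditioned BiCGStab; returns (solution, iterations used).

    Breakdown cases (zero denominators) are not handled in this sketch.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual for the initial guess x = 0
    r_hat = list(r)        # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5 or 1.0
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r[i] + beta * (p[i] - omega * v[i]) for i in range(n)]
        y = jacobi_smooth(A, p, sweeps)      # preconditioner application
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [r[i] - alpha * v[i] for i in range(n)]
        if dot(s, s) ** 0.5 / b_norm < tol:
            return [x[i] + alpha * y[i] for i in range(n)], it
        z = jacobi_smooth(A, s, sweeps)      # preconditioner application
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = [x[i] + alpha * y[i] + omega * z[i] for i in range(n)]
        r = [s[i] - omega * t[i] for i in range(n)]
        if dot(r, r) ** 0.5 / b_norm < tol:
            return x, it
        rho = rho_new
    return x, max_iter


if __name__ == "__main__":
    A = model_matrix(8)
    b = matvec(A, [float(i + 1) for i in range(8)])
    solution, iterations = bicgstab(A, b)
    print(iterations, solution)
```

Each iteration consists of two matrix-vector products, two smoother applications, and a handful of dot products and vector updates. The Jacobi update is independent per unknown, which is the kind of data parallelism that maps naturally onto GPU threads, whereas the dot products require reductions.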