The regularized lattice Boltzmann method (RLBM), an improved variant of the lattice Boltzmann method (LBM), has been applied to the simulation of fluid flow. Its computational performance, however, must be considered when simulating practical problems. The rise of multicore platforms, and especially the popularity of graphics processing units (GPUs), has made parallel implementations feasible. In this article, a parallel RLBM model for heterogeneous CPU/GPU platforms is proposed. To address possible GPU memory shortages, the CPU not only launches the kernel functions of the RLBM algorithm but also takes part in the computation. Following the characteristics of the algorithm, the entire flow field is divided into CPU and GPU computing regions according to given flow field division rules. OpenMP is used for parallel computing on the CPU and CUDA on the GPU. Because launching a kernel function takes little time, the CPU and GPU can be regarded as computing approximately simultaneously. Since data exchange between the CPU and GPU has a significant impact on performance, a buffer is placed at the boundary between the CPU and GPU regions to reduce the frequency of data exchange; the buffer size determines the number of iterations the program performs before exchanging data. Performance is measured in million fluid lattice updates per second (MFLUPS), and the algorithm is applied to 3D lid‐driven cavity flow. The results show that the MFLUPS of the heterogeneous algorithm is 25 times that of the CPU alone and exceeds that of the GPU alone by an increment equivalent to the CPU MFLUPS. The algorithm is further extended to a multi‐GPU version, which is also applied to 3D lid‐driven cavity flow; the results show that the MFLUPS of the multi‐GPU version is 1.8 times that of a single GPU.
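The link between buffer size and exchange frequency, and the MFLUPS metric, can be illustrated with a minimal sketch. It is not the RLBM implementation described here: a 1D three-point averaging stencil stands in for the collision and streaming steps, two Python lists stand in for the CPU and GPU regions, and all names (`step`, `run_whole`, `run_split`, `mflups`) are hypothetical. The sketch assumes that each iteration reads only nearest-neighbor cells, so that a buffer of `b` layers stays usable for `b` iterations before it must be refreshed.

```python
def step(u):
    """One sweep of a 3-point averaging stencil (stand-in for an LBM update).
    The two end cells are left unchanged."""
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def run_whole(u, steps):
    """Reference: advance the undivided field."""
    for _ in range(steps):
        u = step(u)
    return u


def run_split(u, steps, m, b):
    """Advance the field as two regions with a buffer of b cells each.

    The left region owns cells [0, m) and the right region owns [m, n).
    Each region also holds b buffer cells copied from its neighbor.
    Assumption: one iteration spoils one more buffer layer, so the owned
    cells remain correct for b iterations between exchanges.
    """
    n = len(u)
    assert 1 <= b <= m and b <= n - m
    left = u[:m + b]     # owned [0, m) plus buffer [m, m + b)
    right = u[m - b:]    # buffer [m - b, m) plus owned [m, n)
    exchanges = 0
    done = 0
    while done < steps:
        chunk = min(b, steps - done)
        for _ in range(chunk):
            left, right = step(left), step(right)
        done += chunk
        # Exchange: refresh each buffer from the neighbor's owned cells.
        left[m:] = right[b:2 * b]
        right[:b] = left[m - b:m]
        exchanges += 1
    return left[:m] + right[b:], exchanges


def mflups(nx, ny, nz, steps, seconds):
    """Million fluid lattice updates per second for an nx*ny*nz grid."""
    return nx * ny * nz * steps / seconds / 1e6


# Usage: the same 12 iterations with two different buffer sizes.
u0 = [float(i % 5) for i in range(20)]
reference = run_whole(u0, 12)
wide, wide_exchanges = run_split(u0, 12, m=10, b=4)
narrow, narrow_exchanges = run_split(u0, 12, m=10, b=1)
```

In this toy setting both splits reproduce the undivided result, while the wider buffer needs fewer exchanges at the cost of redundant work on the buffer cells. That is the trade-off the buffer size controls.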