The growth of high-resolution video poses a serious challenge to previous-generation video coding standards. New-generation standards greatly ease this problem but dramatically increase coding complexity. Motion estimation is one of the encoder modules with relatively high computational complexity. This paper proposes a parallel motion estimation implementation comprising pre-motion estimation, integer motion estimation, and fractional motion estimation. All three stages are accelerated on the GPU for AVS2, one of the new-generation standards. A rapid mapping table algorithm is introduced to improve the efficiency of data access. In addition, a quasi-integral-graph algorithm is designed to compute the sum of absolute differences (SAD) or the sum of absolute transformed differences (SATD) efficiently for blocks of different sizes. Together, these two techniques improve the utilization and efficiency of threads and exploit the characteristics of the GPU. Experimental results show that the proposed parallel method effectively accelerates motion estimation.
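To illustrate why an integral-style structure helps when SAD is needed for blocks of different sizes, the following is a minimal sketch of the classic summed-area table applied to a plane of per-pixel absolute differences. It is an assumption that the quasi-integral-graph algorithm is related to this idea; the abstract does not specify it. The function names (`abs_diff_plane`, `integral_image`, `block_sad`) and the 4x4 sample data are hypothetical, and the sketch is sequential Python rather than GPU code.

```python
def abs_diff_plane(cur, ref):
    """Per-pixel absolute difference of two equally sized 2-D blocks."""
    return [[abs(c - r) for c, r in zip(crow, rrow)]
            for crow, rrow in zip(cur, ref)]


def integral_image(plane):
    """Summed-area table with an extra leading zero row and column."""
    h, w = len(plane), len(plane[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += plane[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def block_sad(sat, x, y, bw, bh):
    """SAD of the bw x bh block with top-left corner (x, y), from four lookups."""
    return sat[y + bh][x + bw] - sat[y][x + bw] - sat[y + bh][x] + sat[y][x]


# Hypothetical 4x4 current block and reference block for one candidate
# motion vector (illustrative values only).
cur = [[10, 12, 14, 16],
       [11, 13, 15, 17],
       [20, 22, 24, 26],
       [21, 23, 25, 27]]
ref = [[9, 12, 15, 16],
       [11, 10, 15, 20],
       [20, 25, 24, 22],
       [18, 23, 26, 27]]

sat = integral_image(abs_diff_plane(cur, ref))

sad_4x4 = block_sad(sat, 0, 0, 4, 4)           # whole block
sad_2x2 = [block_sad(sat, x, y, 2, 2)          # four 2x2 sub-blocks
           for y in (0, 2) for x in (0, 2)]
print(sad_4x4, sad_2x2)
```

Once the table is built for a candidate motion vector, the SAD of any rectangular partition costs four lookups regardless of its size, so the per-pixel differences are computed once and shared across all block sizes.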