In this paper, we propose a fully parallel truncated Viterbi decoder for Software-Defined Radio (SDR) on the Graphics Processing Unit (GPU) platform. We adopt a map-reduce strategy based on the three-point Viterbi decoding algorithm (TVDA), chosen for its high parallelization potential. With truncation, the trellis of the Viterbi decoding algorithm can be divided into sub-trellises, each of which performs its forward metric computation and trace-back procedure independently and in parallel. The parallel Viterbi decoding algorithm is mapped onto an NVIDIA GTX580 GPU. Experiments show that our method achieves a low bit error rate (BER) and a 36.0x speedup over a C implementation running on a 2.0 GHz CPU. It also achieves a 1.2x to 3.6x performance improvement over the existing GPU-based implementation.
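To illustrate the idea of splitting the trellis into independently decoded sub-trellises, the following is a minimal CPU sketch in Python. It is not the TVDA or the GPU mapping described in this paper. It assumes a rate-1/2, constraint-length-3 convolutional code with generators (7, 5) in octal, hard-decision Hamming metrics, and hypothetical parameters `block_len` and `overlap`. All function names are illustrative. Each sub-trellis is extended by `overlap` symbols on both sides, decoded from an unknown start state, and only its central decisions are kept.

```python
K = 3                      # constraint length (assumed for this sketch)
G = (0b111, 0b101)         # generator polynomials (7, 5) octal (assumed)
NSTATES = 1 << (K - 1)
INF = float("inf")

def parity(x):
    return bin(x).count("1") & 1

def branch(state, bit):
    """Return (next_state, output_pair) for one trellis branch."""
    reg = (bit << (K - 1)) | state
    return reg >> 1, (parity(reg & G[0]), parity(reg & G[1]))

def encode(bits):
    """Encode a bit list, starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        state, sym = branch(state, b)
        out.append(sym)
    return out

def viterbi_block(symbols, start_known):
    """Forward metric computation and trace-back on one sub-trellis."""
    # Unknown start state: every state begins with metric 0.
    metrics = [0 if (s == 0 or not start_known) else INF
               for s in range(NSTATES)]
    history = []
    for r in symbols:
        new = [INF] * NSTATES
        prev = [(0, 0)] * NSTATES
        for s in range(NSTATES):
            if metrics[s] == INF:
                continue
            for b in (0, 1):
                nxt, exp = branch(s, b)
                cost = metrics[s] + (exp[0] != r[0]) + (exp[1] != r[1])
                if cost < new[nxt]:
                    new[nxt], prev[nxt] = cost, (s, b)
        metrics = new
        history.append(prev)
    # Trace back from the best final state.
    state = min(range(NSTATES), key=metrics.__getitem__)
    decoded = []
    for prev in reversed(history):
        state, b = prev[state]
        decoded.append(b)
    decoded.reverse()
    return decoded

def decode_truncated(symbols, block_len, overlap):
    """Decode overlapping sub-trellises and keep each block's central part."""
    n, out = len(symbols), []
    # Iterations do not depend on each other, so each one could be
    # assigned to a separate parallel worker.
    for start in range(0, n, block_len):
        lo = max(0, start - overlap)
        hi = min(n, start + block_len + overlap)
        bits = viterbi_block(symbols[lo:hi], start_known=(lo == 0))
        keep = min(block_len, n - start)
        out.extend(bits[start - lo:start - lo + keep])
    return out
```

In this sketch the loop in `decode_truncated` plays the role of the map step and the concatenation of the kept decisions plays the role of the reduce step. A common rule of thumb is to choose an overlap of several constraint lengths so that the survivor paths have merged before the retained region begins.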