The second-generation digital video broadcasting return channel via satellite (DVB-RCS2) is a promising real-time wireless protocol that has been widely used in many applications, such as video conferences, video feeds, and video multicasting. However, the receiver end of DVB-RCS2 is time-consuming and should be accelerated by high-performance processing systems. Today, graphics processing units (GPUs) are applied in communication systems because of their high parallelism and processing throughput. In this study, we design a novel receiver pipeline on the CPU-GPU platform. Moreover, we propose a CPU-GPU hybrid strategy to fully utilize resources and reduce communication latency. Compared with the parallel turbo decoder proposed in other work on the same platform, our parallel implementation achieves higher throughput. For the entire DVB-RCS2 receiver, our proposed pipeline obtains speedups of 20 times and 6 times over the non-pipelined serial and non-pipelined parallel algorithms, respectively. In addition, the latency of our implementation is lower than that of the non-pipelined CPU-GPU implementation, which is equal to 1.06 ms.
KEYWORDS
CPU-GPU hybrid platform, pipeline, real-time receiver, software defined radio
INTRODUCTION
Over the past decade, digital video broadcasting return channel via satellite (DVB-RCS) and its second generation, DVB-RCS2, have rapidly gained prominence and support from governments and industries because they enable low-cost, high-performance, and reliable internet protocol (IP) networks through very small aperture terminals (VSATs). DVB-RCS2 specifies new technology that significantly advances time division multiple access (TDMA) performance, offering improved efficiency, increased throughput, and greater network reliability than DVB-RCS. DVB-RCS2 provides solutions for high-speed Internet access, video services, voice over IP, and Global System for Mobile Communications (GSM) backhaul used by the maritime industry, energy sector, utility companies, and global corporations. To the best of our knowledge, all services supported by DVB-RCS2 need low latency and high throughput and must provide real-time capability. Although the receive throughput of DVB-RCS2 is important, the receiver cannot achieve high performance on a CPU platform. The graphics processing unit (GPU) provides a new platform to accelerate wireless applications in software-defined radio. Compared with a CPU, a GPU is a highly parallel, multi-threaded, multi-core processor with tremendous computational power and very large memory bandwidth. Thus, a GPU is well suited to computation-intensive and highly parallel tasks.
For convenience, we use the term demodulator to represent all modules of the receiver except the turbo decoder, as shown in Figure 2. By analyzing the running time of each part of the receiver, we find that the decoder is the most time-consuming part, so it has to be accelerated on the CPU-GPU platform. In fact, numerous studies have focused on how to accelerate the turbo decoder using the CPU-GPU platform. [1][2][3][4][5]...