2014
DOI: 10.1587/transfun.e97.a.1027
An Efficient Parallel SOVA-Based Turbo Decoder for Software Defined Radio on GPU

Cited by 5 publications (5 citation statements). References 16 publications.
“…To create a socket, four parameters are required: its datatype (given as a template parameter), its associated task, its name, and its size. Finally, a "codelet" function needs to be set (lines 14-26). This codelet will be called when the task is triggered.…”
Section: Elementary Components
confidence: 99%
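The quotation above describes a task/socket/codelet pattern. Below is a minimal, self-contained C++ sketch of that pattern for illustration only; the type and member names (Task, Socket, codelet, exec) are hypothetical stand-ins and are not the actual API of the cited framework.

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

template <typename T>
struct Socket {                 // one typed data port attached to a task
    std::string name;
    std::vector<T> data;        // its size is fixed when the socket is created
};

struct Task {
    std::string name;
    Socket<int32_t> in;         // sockets are defined by datatype, task, name, size
    Socket<int32_t> out;
    std::function<void(Task&)> codelet;   // called when the task is triggered

    Task(const std::string& n, std::size_t size)
        : name(n),
          in{"in",  std::vector<int32_t>(size)},
          out{"out", std::vector<int32_t>(size)} {}

    void exec() { codelet(*this); }       // triggering the task runs its codelet
};

int main() {
    Task add_one("add_one", 4);
    add_one.in.data = {1, 2, 3, 4};

    // The codelet holds the actual processing performed by the task.
    add_one.codelet = [](Task& t) {
        for (std::size_t i = 0; i < t.in.data.size(); ++i)
            t.out.data[i] = t.in.data[i] + 1;
    };

    add_one.exec();
    for (int32_t v : add_one.out.data) std::cout << v << ' ';   // prints: 2 3 4 5
    std::cout << '\n';
}
```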
“…Many SDR elementary blocks have been optimized for Intel® and ARM® CPUs. High throughput results have been achieved on GPUs, 19‐23 but latency results are still too high to meet real-time constraints and to compete with CPU implementations. 22,24‐33 This is mainly due to data transfers between the host (CPUs) and the device (GPUs), and to the nature of GPU designs, which are not optimized for latency efficiency.…”
Section: Introduction
confidence: 99%
“…In [16], [20], multiple trellis-state computations are performed in parallel in the SIMD units. In [9]-[18], [20], the decoded frame is split into sub-blocks that are processed in parallel in the SIMD units. An alternative approach is to process both SISO decoders in parallel, but it requires additional computations for synchronization and/or impacts the error-correction performance [24].…”
Section: Parallelism Analysis
confidence: 99%
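The sub-block (intra-frame) parallelism mentioned in the quotation above can be illustrated with a short sketch: one received frame is split into equally sized sub-blocks, and each sub-block is handed to its own worker. This is a minimal sketch, not the cited decoders; decode_sub_block() is a placeholder for a SISO decoder (e.g. SOVA or max-log-MAP), and a real implementation would also exchange boundary state metrics between neighbouring sub-blocks.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Placeholder SISO decoder: maps channel LLRs of one sub-block to output LLRs.
static void decode_sub_block(const float* llr_in, float* llr_out, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i)
        llr_out[i] = llr_in[i];   // identity stand-in for the real trellis processing
}

int main() {
    const std::size_t frame_len    = 6144;   // e.g. the largest LTE frame size
    const std::size_t n_sub_blocks = 8;      // degree of intra-frame parallelism
    const std::size_t sub_len      = frame_len / n_sub_blocks;

    std::vector<float> llr_in(frame_len, 0.5f), llr_out(frame_len, 0.0f);

    // Each sub-block is decoded independently; on a GPU the same split maps to
    // thread blocks / SIMD lanes instead of CPU threads.
    std::vector<std::thread> workers;
    for (std::size_t b = 0; b < n_sub_blocks; ++b)
        workers.emplace_back(decode_sub_block,
                             llr_in.data()  + b * sub_len,
                             llr_out.data() + b * sub_len,
                             sub_len);
    for (auto& w : workers) w.join();
    return 0;
}
```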
“…In [9]-[18], turbo decoders were implemented on GPU targets to benefit from their computing power and to comply with the throughputs required by LTE. This was made possible by exploiting the parallelism within the turbo decoding process (intra-frame parallelism).…”
Section: Introduction
confidence: 99%
“…In [2,3,4], GPUs are used to implement the turbo decoding algorithm, exploiting their computing power and programmability. By fully utilizing the enormous parallelism of GPUs, many codeword blocks can be decoded simultaneously.…”
Section: Introduction
confidence: 99%