2014
DOI: 10.1109/les.2014.2311317

A High Throughput Efficient Approach for Decoding LDPC Codes onto GPU Devices

Abstract: The low-density parity-check (LDPC) decoding process is known to be compute intensive. This kind of digital communication application has recently been implemented on graphics processing unit (GPU) devices for LDPC code performance estimation and/or for real-time measurements. Previous studies of LDPC decoding on GPUs were all based on implementations of the flooding-based decoding algorithm, which provides massive computation parallelism. More efficient layered schedules have been proposed in the literature because decod…
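The abstract contrasts the flooding schedule, in which every check-node update of an iteration is computed from the same snapshot of the soft values, with the layered schedule, in which values refreshed by one check are reused by the next within the same iteration. The sketch below is only a generic min-sum illustration of that difference; the data structures (Code, L, R) are assumed for the example and this is not the decoder proposed in the paper.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Code {
    // check_vars[c] lists the variable-node indices connected to check node c.
    std::vector<std::vector<int>> check_vars;
};

// Layered (row-layered) iteration: checks are processed one after the other and
// the posterior LLRs L are refreshed immediately, so the next check already sees
// the updated values. R[c][j] is the last check-to-variable message on edge (c, j).
void layered_iteration(const Code& code, std::vector<float>& L,
                       std::vector<std::vector<float>>& R) {
    for (std::size_t c = 0; c < code.check_vars.size(); ++c) {
        const auto& vars = code.check_vars[c];
        std::vector<float> t(vars.size());
        for (std::size_t j = 0; j < vars.size(); ++j)
            t[j] = L[vars[j]] - R[c][j];                   // remove old contribution
        for (std::size_t j = 0; j < vars.size(); ++j) {
            float sign = 1.0f, mag = 1e30f;
            for (std::size_t k = 0; k < vars.size(); ++k) {
                if (k == j) continue;
                sign *= (t[k] < 0.0f) ? -1.0f : 1.0f;
                mag = std::min(mag, std::fabs(t[k]));
            }
            R[c][j] = sign * mag;                          // min-sum check update
            L[vars[j]] = t[j] + R[c][j];                   // immediate LLR refresh
        }
    }
}

// Flooding iteration: every check message is computed from the same snapshot of
// the LLRs, and the LLRs are only updated once all checks are done, so all checks
// can run in parallel, but roughly twice as many iterations are needed.
void flooding_iteration(const Code& code, std::vector<float>& L,
                        std::vector<std::vector<float>>& R) {
    const std::vector<float> L_snap = L;
    std::vector<std::vector<float>> R_new = R;
    for (std::size_t c = 0; c < code.check_vars.size(); ++c) {
        const auto& vars = code.check_vars[c];
        for (std::size_t j = 0; j < vars.size(); ++j) {
            float sign = 1.0f, mag = 1e30f;
            for (std::size_t k = 0; k < vars.size(); ++k) {
                if (k == j) continue;
                const float t = L_snap[vars[k]] - R[c][k];
                sign *= (t < 0.0f) ? -1.0f : 1.0f;
                mag = std::min(mag, std::fabs(t));
            }
            R_new[c][j] = sign * mag;
        }
    }
    for (std::size_t c = 0; c < code.check_vars.size(); ++c)   // apply all at once
        for (std::size_t j = 0; j < code.check_vars[c].size(); ++j) {
            L[code.check_vars[c][j]] += R_new[c][j] - R[c][j];
            R[c][j] = R_new[c][j];
        }
}

The flooding variant exposes every check node at once, which is why it maps so naturally onto massively parallel GPUs; the layered variant reuses fresh values inside an iteration and typically converges in roughly half as many iterations, at the price of serializing the checks within a layer.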

Cited by 36 publications (6 citation statements)
References 11 publications

“…To create a socket, four parameters are required: its datatype (given as a template parameter), its associated task, its name, and its size. Finally, a "codelet" function needs to be set (lines 14-26). This codelet will be called when the task is triggered.…”
Section: Elementary Components
mentioning confidence: 99%
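As a rough illustration of the socket/codelet pattern this citation describes, here is a minimal, self-contained C++ sketch. The names Task, Socket, create_socket, set_codelet and exec are hypothetical stand-ins rather than the cited library's actual API; only the four socket parameters (datatype as a template parameter, associated task, name, size) and the codelet callback follow the quoted description.

#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A socket carries typed data into or out of a task (hypothetical layout).
struct Socket {
    std::string name;
    std::size_t size;                      // number of elements
    std::vector<std::uint8_t> buffer;      // raw storage, size * sizeof(T) bytes
};

class Task {
public:
    explicit Task(std::string name) : name_(std::move(name)) {}

    // The four parameters from the quoted description: the datatype is the
    // template parameter, the associated task is *this, plus a name and a size.
    template <typename T>
    void create_socket(const std::string& name, std::size_t size) {
        sockets_.push_back({name, size, std::vector<std::uint8_t>(size * sizeof(T))});
    }

    // The codelet is the function that runs when the task is triggered.
    void set_codelet(std::function<int(Task&)> codelet) { codelet_ = std::move(codelet); }

    int exec() { return codelet_ ? codelet_(*this) : -1; }   // trigger the task

    Socket& socket(std::size_t i) { return sockets_[i]; }

private:
    std::string name_;
    std::vector<Socket> sockets_;
    std::function<int(Task&)> codelet_;
};

int main() {
    Task add("add");                       // hypothetical "add one" task
    add.create_socket<float>("in", 8);     // socket 0: 8 input floats
    add.create_socket<float>("out", 8);    // socket 1: 8 output floats
    add.set_codelet([](Task& t) {
        auto* in  = reinterpret_cast<float*>(t.socket(0).buffer.data());
        auto* out = reinterpret_cast<float*>(t.socket(1).buffer.data());
        for (std::size_t i = 0; i < 8; ++i) out[i] = in[i] + 1.0f;
        return 0;
    });
    return add.exec();                     // the stored codelet runs here
}

Calling exec() stands in for the task being triggered: the stored codelet is invoked with access to the task's sockets.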
“…Many SDR elementary blocks have been optimized for Intel® and ARM® CPUs. High-throughput results have been achieved on GPUs, 19‐23 but latency is still too high to meet real-time constraints and to compete with CPU implementations. 22,24‐33 This is mainly due to data transfers between the host (CPUs) and the device (GPUs), and to the nature of GPU designs, which are not optimized for latency efficiency.…”
Section: Introduction
mentioning confidence: 99%
“…GPU-based high-throughput LDPC decoders have been widely studied in the past years [21]-[26]. In [21], a high-throughput decoder based on layered scheduling was proposed.…”
Section: Introduction
mentioning confidence: 99%
“…GPU-based high-throughput LDPC decoders have been widely studied in the past years [21]-[26]. In [21], a high-throughput decoder based on layered scheduling was proposed. Some GPU-based optimizations were presented in [22] to obtain a high throughput, which reached a 1.27 Gbps peak throughput on a single GPU.…”
Section: Introduction
mentioning confidence: 99%
“…The irregular data access patterns featured in turbo and LDPC decoders make it difficult to use the Single-Instruction Multiple-Data (SIMD) extensions present on today's processors efficiently. To overcome the difficulty of efficiently accessing memory while decoding one frame and still achieve a good throughput, software decoders resorting to inter-frame parallelism (decoding multiple independent frames at the same time) are often proposed [11]-[13]. Inter-frame parallelism comes at the cost of higher latency, as many frames have to be buffered before decoding can start.…”
Section: Introduction
mentioning confidence: 99%
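To make the inter-frame parallelism idea concrete, the following C++ sketch assumes an interleaved memory layout (not taken from the cited decoders): decoding F frames together turns each irregular per-edge index into F contiguous accesses, so the inner loop over frames maps directly onto SIMD lanes.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t F = 8;   // frames decoded together, e.g. one SIMD lane per frame

// Interleaved layout: the value of variable node v for frame f lives at llr[v * F + f].
// The irregular index v is shared by all frames, so every access touches F
// contiguous floats and the loop over f is trivially vectorizable.

// Smallest absolute value seen by one check node, per frame, over its (irregular)
// set of connected variable nodes -- the core of a min-sum check update.
void check_min_abs(const std::vector<float>& llr, const std::vector<int>& var_idx,
                   float min_abs[F]) {
    for (std::size_t f = 0; f < F; ++f) min_abs[f] = 1e30f;
    for (int v : var_idx) {
        const float* lane = &llr[static_cast<std::size_t>(v) * F];
        for (std::size_t f = 0; f < F; ++f) {          // SIMD-friendly inner loop
            const float a = std::fabs(lane[f]);
            if (a < min_abs[f]) min_abs[f] = a;
        }
    }
}

// Intra-frame version for comparison: the same irregular indices now produce a
// scattered (gather) access pattern that SIMD units handle poorly.
float check_min_abs_single(const std::vector<float>& llr_one_frame,
                           const std::vector<int>& var_idx) {
    float m = 1e30f;
    for (int v : var_idx) m = std::min(m, std::fabs(llr_one_frame[v]));
    return m;
}

The same F-frame buffering that makes the inner loop vectorizable is exactly what raises latency: decoding cannot start until all F frames have been received.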