2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2014.6853936
Flexible non-binary LDPC decoding on FPGAs

Cited by 11 publications (6 citation statements)
References 15 publications
“…Andrade et al. used OpenCL in [91] to provide a parallel FFT-SPA decoder architecture suited to the characteristics of a wide-pipeline accelerator, reaching 3.36 Mbps. Later, for the same algorithm, the authors showed that high-level synthesis (HLS) can reduce the development effort compared to register-transfer-level (RTL) design, while still achieving 14.54 Mbps with 14 cores.…”
Section: B. FPGA
confidence: 99%
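For context on the FFT-SPA named in this statement: for NB-LDPC codes over GF(2^m), the check-node convolution diagonalizes under the Walsh-Hadamard transform, which is what the FFT-based sum-product decoder exploits. The sketch below is a generic, illustrative C++ version of that check-node update; all function and variable names are ours and are not taken from [91] or from the paper indexed here.

```cpp
// Illustrative FFT-SPA check-node update for an NB-LDPC code over GF(2^m):
// the GF-domain convolution becomes a pointwise product after a
// Walsh-Hadamard transform (WHT). Sketch only, not the paper's implementation.
#include <cstddef>
#include <vector>

// In-place fast Walsh-Hadamard transform of a length-q vector, q = 2^m.
static void wht(std::vector<double>& v) {
    const std::size_t q = v.size();
    for (std::size_t len = 1; len < q; len <<= 1)
        for (std::size_t i = 0; i < q; i += len << 1)
            for (std::size_t j = i; j < i + len; ++j) {
                const double a = v[j], b = v[j + len];
                v[j] = a + b;
                v[j + len] = a - b;
            }
}

// Check-node update: given dc incoming probability vectors (length q each),
// produce dc outgoing vectors; output edge k excludes incoming edge k.
std::vector<std::vector<double>> checkNodeUpdate(
        std::vector<std::vector<double>> in) {  // taken by value: transformed in place
    const std::size_t dc = in.size();
    const std::size_t q  = in.front().size();

    for (auto& msg : in) wht(msg);              // forward transform per edge

    std::vector<std::vector<double>> out(dc, std::vector<double>(q, 1.0));
    for (std::size_t k = 0; k < dc; ++k)        // pointwise product of all
        for (std::size_t j = 0; j < dc; ++j)    // transforms except edge k
            if (j != k)
                for (std::size_t s = 0; s < q; ++s) out[k][s] *= in[j][s];

    for (auto& msg : out) {                     // inverse transform (= WHT / q),
        wht(msg);                               // then clamp and renormalize
        double sum = 0.0;
        for (double& x : msg) {
            x /= static_cast<double>(q);
            if (x < 0.0) x = 0.0;
            sum += x;
        }
        for (double& x : msg)
            x = (sum > 0.0) ? x / sum : 1.0 / static_cast<double>(q);
    }
    return out;
}
```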
“…The implementations missing the SNR and the number of iterations [63], [91], [93], [96] do not provide a meaningful contribution in terms of error-correction performance; instead, they focus on resource requirements. In particular, the first decoder implementations [63], [93], [96] pioneered the analysis of memory requirements for NB-LDPC decoders.…”
Section: Analysis
confidence: 99%
“…The LDPC decoder architecture, based on the FFT-SPA [3], is structured into different kernels composed of double- to triple-nested loop body functions. The optimizations carried out consist of (1) moving all intermediate results from DRAM to BRAM-defined arrays; (2) array partitioning to expose more R/W ports and increase bandwidth; (3) unrolling of the innermost loops; and (4) pipelining of the outermost loops.…”
Section: HLS-Generated Architecture
confidence: 99%
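The four optimizations quoted above map naturally onto standard HLS directives. The following is a minimal, hypothetical Vivado-HLS-style kernel that illustrates the pattern (on-chip buffers, ARRAY_PARTITION, UNROLL, PIPELINE); the function, identifiers, and sizes are illustrative and are not taken from the decoder described in the paper.

```cpp
// Hypothetical variable-node-style HLS kernel illustrating optimizations (1)-(4).
#define Q  64   // field size GF(2^m), assumed for illustration
#define DV 4    // node degree, assumed for illustration

void vn_kernel(const float llr_in[Q], const float msg[DV][Q], float out[DV][Q]) {
    // (1) Hold intermediate results in on-chip (BRAM/register) arrays, not DRAM.
    float local_msg[DV][Q];
    float acc[Q];
    // (2) Partition the arrays to expose more R/W ports and increase bandwidth.
#pragma HLS ARRAY_PARTITION variable=local_msg complete dim=1
#pragma HLS ARRAY_PARTITION variable=acc cyclic factor=8 dim=1

copy_in:
    for (int d = 0; d < DV; ++d) {
        for (int s = 0; s < Q; ++s) {
#pragma HLS PIPELINE II=1       // steady one-element-per-cycle transfer
            local_msg[d][s] = msg[d][s];
        }
    }

accumulate:
    for (int s = 0; s < Q; ++s) {
#pragma HLS PIPELINE II=1       // (4) pipeline the outermost loop
        float sum = llr_in[s];
        for (int d = 0; d < DV; ++d) {
#pragma HLS UNROLL              // (3) fully unroll the innermost loop
            sum += local_msg[d][s];
        }
        acc[s] = sum;
    }

write_out:
    for (int d = 0; d < DV; ++d) {
        for (int s = 0; s < Q; ++s) {
#pragma HLS PIPELINE II=1
            out[d][s] = acc[s] - local_msg[d][s];   // extrinsic-style output
        }
    }
}
```

Pipelining the outer symbol loop with the degree loop fully unrolled is what lets the partitioned on-chip arrays sustain one update per clock cycle, which is the usual rationale for combining directives (2) through (4).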