2013 23rd International Conference Radioelektronika (RADIOELEKTRONIKA), 2013
DOI: 10.1109/radioelek.2013.6530933

Low level source code optimizing for single/multi/core digital signal processors

Cited by 7 publications
(5 citation statements)
References 2 publications
“…For illustration, the achievable results are presented by comprehensible 4-point FFT radix-2 with time decimated complex input [23]. Thanks to the optimizations from [24], the 4-point version contains only addition and subtraction operations. The part of algorithm description without signal definitions is shown in Listing 1 and for better understanding, it can be visualized by the generated DOT file [25] (see Fig.…”
Section: Basic Behavior (mentioning)
confidence: 99%
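
The statement above notes that a 4-point radix-2 FFT can be reduced to additions and subtractions only. The following is a minimal sketch of why that holds, not the code from the cited listing: the function name `fft4`, the natural-order input, and the (real, imaginary) tuple representation are assumptions made for illustration. The only twiddle factors in a 4-point transform are 1 and -j, and multiplying by -j amounts to swapping the real and imaginary parts with one sign change.

```python
def fft4(x):
    """4-point radix-2 decimation-in-time FFT using only additions and subtractions.

    Hypothetical helper for illustration; `x` is a sequence of four
    complex (or real) samples in natural order.
    """
    # Split every sample into real and imaginary parts.
    (r0, i0), (r1, i1), (r2, i2), (r3, i3) = [
        (complex(c).real, complex(c).imag) for c in x
    ]

    # Stage 1: 2-point butterflies on the even (x0, x2) and odd (x1, x3) samples.
    ar, ai = r0 + r2, i0 + i2   # a = x0 + x2
    br, bi = r0 - r2, i0 - i2   # b = x0 - x2
    cr, ci = r1 + r3, i1 + i3   # c = x1 + x3
    dr, di = r1 - r3, i1 - i3   # d = x1 - x3

    # Stage 2: twiddle factors are W4^0 = 1 and W4^1 = -j.
    # (-j) * (dr + j*di) = di - j*dr, so no multiplication is needed.
    out = (
        (ar + cr, ai + ci),     # X0 = a + c
        (br + di, bi - dr),     # X1 = b - j*d
        (ar - cr, ai - ci),     # X2 = a - c
        (br - di, bi + dr),     # X3 = b + j*d
    )
    return [complex(re, im) for re, im in out]
```

The sketch performs 16 real additions or subtractions and no multiplications, which is the property the quoted passage attributes to the optimized 4-point version.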
“…The code for the floating-point data type does not do that, because the floating-point operations take more instruction cycles for its execution. For the comparison with the hand optimized code from [4], the hand optimized 4-point FFT with single precision complex input takes 24 instruction cycles and the average unit load is about 30%. The hand optimized 8-point FFT takes 42 instruction cycles and the unit load is about 55%.…”
Section: -Point FFT (mentioning)
confidence: 99%
“…There are also available frameworks with the increased efficiency [3] that can work with VLIW architectures as well. The difference between the hand optimized code and the compiled code from the high-level language can be still significant, when the signal processing algorithm is implemented on any VLIW processor [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…Although the implementation of high efficiency, a little unreasonable program design would lead to a sharp decline in performance. Richard Prokesch [4] concludes that if runtime predictability is important, that manual parallelization should be used because of managing the worker cores is controlled by the developer.…”
Section: OpenMP (mentioning)
confidence: 99%