2011
DOI: 10.1109/tsp.2011.2168525

A Methodology for Speeding Up Fast Fourier Transform Focusing on Memory Architecture Utilization

Cited by 10 publications (13 citation statements)
References 16 publications
“…A comparison with the above libraries would be unfair because they use the SIMD (Single Instruction Multiple Data) vector instructions (they support load/store and arithmetical instructions with 128/256-bit data); however, our future work includes the support of SIMD instructions. In [29] [30] [31] [32], we have developed algorithm specific methodologies (we used the SIMD instructions), which produce lower execution time, lower compilation time and lower number of data accesses, than ATLAS [29] [30], FFTW [30] and OpenCV [32]. A comparison between the proposed methodology and [29] [30], is made in Section 4.…”
Section: Related Work (mentioning)
confidence: 99%
“…Common VLSI implementation of FFT architectures can be classified into three categories: memory-based architectures [3,4,5,6], cache-based architectures [7,8,9] and pipelined architectures [10,11,12,13]. Memory-based architectures generally consist of processing units and memory blocks.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, the FFT performance is tightly dependent to the processor's architecture and its memory and cache structure. Kelefouras et al [24] propose a methodology to speed up the FFT algorithm depending on the memory hierarchy of processor's architecture. State-of-the-art FFT libraries such as FFTW [25][26][27] and UHFFT [28] maximize the performance by adapting to the hardware at run time, usually using a planner, searching over a large space of parameters in order to pick the best implementation.…”
Section: Introduction (mentioning)
confidence: 99%