2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines 2012
DOI: 10.1109/fccm.2012.40

Memory Bandwidth Efficient Two-Dimensional Fast Fourier Transform Algorithm and Implementation for Large Problem Sizes

Abstract: Prevailing VLSI trends point to a growing gap between the scaling of on-chip processing throughput and off-chip memory bandwidth. An efficient use of memory bandwidth must become a first-class design consideration in order to fully utilize the processing capability of highly concurrent processing platforms like FPGAs. In this paper, we present key aspects of this challenge in developing FPGA-based implementations of two-dimensional fast Fourier transform (2D-FFT) where the large datasets must reside o…

Cited by 29 publications (38 citation statements)
References: 6 publications
“…As explained in [8], given the tiled data layout, one can compute the 2D-FFT while avoiding strided accesses. The main idea is instead of transferring stripe of elements in row and column direction as shown in Figure 1, transfer "tiles" in row and column direction (see 1 and 2 in Figure 3(a)).…”
Section: FFTs Using Block Data Layouts (mentioning)
confidence: 99%
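
The tiled layout described in the statement above can be illustrated with a small address-mapping sketch. This is not the cited paper's implementation: the function names, the 8 x 8 array, and the 4 x 4 tile size are all assumptions chosen for illustration. It compares how many separate contiguous address runs are needed to fetch a column-direction band of data under a plain row-major layout versus a tiled layout.

```python
def row_major_addr(r, c, n):
    """Linear address of element (r, c) in an n x n row-major array."""
    return r * n + c


def tiled_addr(r, c, n, t):
    """Linear address of (r, c) when the array is stored as t x t tiles.

    Assumed layout: tiles are stored in row-major order, and elements inside
    a tile are also row-major, so each tile occupies t*t consecutive addresses.
    """
    tiles_per_row = n // t
    tile_index = (r // t) * tiles_per_row + (c // t)
    offset = (r % t) * t + (c % t)
    return tile_index * t * t + offset


def contiguous_runs(addrs):
    """Count maximal runs of consecutive addresses, in access order."""
    if not addrs:
        return 0
    runs = 1
    for prev, cur in zip(addrs, addrs[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


n, t = 8, 4  # hypothetical sizes: 8 x 8 array, 4 x 4 tiles

# Column-direction access, row-major layout: walk columns 0..t-1 top to
# bottom. Every access is n addresses away from the previous one (strided).
strided = [row_major_addr(r, c, n) for c in range(t) for r in range(n)]

# Column-direction access, tiled layout: walk the tiles of the same band
# top to bottom, reading each tile as one block.
tiled = [
    tiled_addr(r, c, n, t)
    for tile_row in range(n // t)
    for r in range(tile_row * t, (tile_row + 1) * t)
    for c in range(t)
]

strided_runs = contiguous_runs(strided)
tiled_runs = contiguous_runs(tiled)
print(strided_runs, tiled_runs)
```

Under these assumptions the strided traversal never reads two adjacent addresses in a row, while the tiled traversal needs only one contiguous run per tile.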
“…There have been many implementations of single and multidimensional FFTs on various platforms. These include software implementations on CPUs [3,4], GPUs [5], supercomputers [6,7], and hardware implementations [8,9,10]. These implementations either do not address the memory access pattern issue or provide a solution for a specific target platform and problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…2D FFT. Next we focus on large size 2D-FFT which is a dense computation used in SAR imaging [23], [4]. Image sizes used in SAR image reconstruction are usually very large, hence requires large-size 2D-FFT computation.…”
Section: D Lim Accelerated Data Intensive Applications (mentioning)
confidence: 99%
“…For an efficient HW implementation of FFT we use Spiral [9] formula generation and optimization framework. Spiral features block data layout FFTs for large datasets of SAR images to address the DRAM bandwidth utilization [13], [14]. These DRAM-optimized FFT implementations make use of the tiled memory layout by mapping each tile to a DRAM row hence minimize the number of row buffer misses.…”
Section: D-ifft and System Integration (mentioning)
confidence: 99%
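
The effect of mapping each tile to one DRAM row, as the statement above describes, can be sketched with a toy row-buffer model. Everything here is an assumption for illustration: a single DRAM bank with one open row, a DRAM row of 16 words, a 16 x 16 array, and 4 x 4 tiles so that one tile fills exactly one DRAM row. Real DRAM has multiple banks and timing constraints that this sketch ignores.

```python
def tiled_addr(r, c, n, t):
    """Address of (r, c) in an n x n array stored as row-major t x t tiles."""
    tile_index = (r // t) * (n // t) + (c // t)
    return tile_index * t * t + (r % t) * t + (c % t)


def count_row_misses(addrs, row_words):
    """Count row-buffer misses for a single bank with one open row.

    A miss is counted whenever an access targets a DRAM row other than
    the one currently held in the row buffer.
    """
    open_row = None
    misses = 0
    for a in addrs:
        row = a // row_words
        if row != open_row:
            misses += 1
            open_row = row
    return misses


n, t = 16, 4        # hypothetical array and tile sizes
row_words = t * t   # assumed: one tile fills exactly one DRAM row

# Column-wise traversal of a row-major array: each access lands in a
# different DRAM row from the previous access.
row_major_cols = [r * n + c for c in range(n) for r in range(n)]

# Column-direction traversal of the tiled array, one whole tile at a time.
tiled_cols = [
    tiled_addr(r, c, n, t)
    for tile_col in range(n // t)
    for tile_row in range(n // t)
    for r in range(tile_row * t, (tile_row + 1) * t)
    for c in range(tile_col * t, (tile_col + 1) * t)
]

row_major_misses = count_row_misses(row_major_cols, row_words)
tiled_misses = count_row_misses(tiled_cols, row_words)
print(row_major_misses, tiled_misses)
```

In this model the row-major column traversal misses on every access, while the tiled traversal misses once per tile, which is the behaviour the quoted statement attributes to the tiled memory layout.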
“…The re-gridding and FFT units are implemented in the logic layer of a 3D-stacked DRAM similar to [15], [16]. The 2D-FFT requires double-buffered local memory that performs data permutations and a local FFT core that executes the FFT kernel [13], [16]. We also construct a double-buffered interpolation unit that streams the interpolated rectangular grid data into the 2D-FFT unit.…”
Section: D-ifft and System Integration (mentioning)
confidence: 99%