ACM/IEEE SC 2006 Conference (SC'06) 2006
DOI: 10.1109/sc.2006.13

Architectures and APIs: Assessing Requirements for Delivering FPGA Performance to Applications

Abstract: Reconfigurable computing leveraging field programmable gate arrays (FPGAs) is one of many accelerator technologies that are being investigated for application to high performance computing (HPC). Like most accelerators, FPGAs are very efficient at both dense matrix multiplication and FFT computations, but two important aspects of how to deliver that performance to applications have received too little attention. First, the standard API for important compute kernels hides parallelism from the system. Second, th…

Cited by 14 publications (8 citation statements)
References 17 publications
“…One popular algorithm is: (1) perform the forward FFT on the signal; (2) make the components of negative frequencies zero; (3) calculate the inverse Fast Fourier Transform (iFFT) to obtain the analytic signal; and (4) take the imaginary portion as the HT of the original signal. Since the FPGA has been proven to be far more efficient in embedding the FFT versus its microprocessor/Digital Signal Processor counterpart [32], the selection of this algorithm may be rational and provides an exact solution.…”
Section: Embedding Hilbert Transform (HT); type: mentioning
confidence: 99%
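
The four steps quoted above can be sketched in plain Python. This is a minimal illustration, not the citing paper's FPGA implementation: the function names `fft` and `hilbert_transform` are hypothetical, the input length is assumed to be a power of two, and a simple recursive radix-2 FFT stands in for whatever FFT core is actually used. One assumption goes beyond the quoted text: the positive-frequency bins are doubled, as in the standard analytic-signal construction, so that the imaginary part has the full amplitude of the Hilbert transform rather than half of it.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT. len(x) must be a power of two.

    With inverse=True the sign of the exponent flips; the result is
    NOT divided by len(x), so the caller applies that scaling.
    """
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hilbert_transform(signal):
    """Hilbert transform of a real sequence via the analytic signal."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two, at least 2")

    # Step 1: forward FFT of the signal.
    spectrum = fft([complex(v) for v in signal])

    # Step 2: zero the negative-frequency bins (n/2+1 .. n-1).
    # Assumption beyond the quoted text: positive bins (1 .. n/2-1)
    # are doubled; DC and Nyquist bins are left unchanged.
    for k in range(1, n // 2):
        spectrum[k] *= 2
    for k in range(n // 2 + 1, n):
        spectrum[k] = 0j

    # Step 3: inverse FFT (with 1/n scaling) gives the analytic signal.
    analytic = [v / n for v in fft(spectrum, inverse=True)]

    # Step 4: the imaginary part is the Hilbert transform.
    return [v.imag for v in analytic]


if __name__ == "__main__":
    n = 64
    cosine = [math.cos(2 * math.pi * 3 * i / n) for i in range(n)]
    ht = hilbert_transform(cosine)
    err = max(abs(ht[i] - math.sin(2 * math.pi * 3 * i / n)) for i in range(n))
    print("max deviation from sine:", err)
```

The demo relies on the fact that the Hilbert transform of a cosine is the sine of the same frequency, which makes it a convenient sanity check for any FFT-based implementation.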
“…For FPGAs to effectively participate in computations, adequate communication paths are required. I/O bottlenecks between the processor and FPGA frequently limit reconfigurable systems from greater performance [Underwood et al 2006]. Given the need for parallelism, scalability, and throughput, research is converging to on-chip networks as the connection architecture of choice [Dally and Towles 2001;Kapre et al 2006;Mak et al 2006;Pionteck et al 2007].…”
Section: Benefits and Challenges; type: mentioning
confidence: 99%
“…Underwood et al proposed a technique to predict execution time of FFT on FPGAs [13]. Similar to us, they construct a performance model of FFT by dividing the total execution into several sub steps, and derive the model parameters from profiling results.…”
Section: Related Work; type: mentioning
confidence: 99%