Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181)
DOI: 10.1109/icassp.1998.681704

FFTW: an adaptive software architecture for the FFT

Cited by 1,327 publications (1,032 citation statements)
References 16 publications
“…All algorithms were implemented in C and tested on an AMD Athlon™ XP 2700+ with 2GB main memory, SuSe-Linux (kernel 2.6.5-7.151-default, gcc 3.3.5) using double precision arithmetic. Moreover, we have used the libraries FFTW 3.0.1 [6] and the NFFT 3.0 library [8], now including the fast NFSFT algorithms. Throughout our experiments we have applied the NFFT routines with precomputed Kaiser-Bessel functions and an oversampling factor of two.…”
Section: Examples (mentioning)
confidence: 99%
“…Current limited prototypes for dense matrix multiplication (ATLAS [26] and PHIPAC [5]), sparse matrix-vector multiplication (Sparsity [17,16]), and FFTs (FFTW [13,12]) show that we can frequently do as well as or even better than hand-tuned vendor code on the kernels attempted.…”
Section: Libraries (mentioning)
confidence: 99%
“…The one-dimensional, Q-point FFTs in steps 1 and 5 of the PCFFT are computed with the 1-D FFTW [4] package. Table 2 compares the execution times in seconds of the 3-D FFTW [4] and our parallel crystallographic FFT. We also computed the speed up ratios between the 3-D FFTW and the PCFFT.…”
Section: Computer Experiments (mentioning)
confidence: 99%
“…The less classical but more common speed-up definition, which is the ratio between the execution times of the parallel method in P processors and the parallel method in one processor, is meaningless in our case, since the PCFFT in one processor is, on average, far less efficient than the 3-D FFTW. We also compared the PCFFT run times with those of the parallel MPI-FFTW [4]. It turns out, however, that due to the load unbalancing induced by the irregularity of the problem, and above all, the large data array transpositions and communications that were required, the MPI-FFTW performed very poorly in our system.…”
Section: Computer Experiments (mentioning)
confidence: 99%
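
The two speed-up definitions contrasted in the statement above can be illustrated with a minimal Python sketch. The function names and all timings are hypothetical and chosen only for illustration; they are not measurements from the cited paper. The sketch assumes the usual convention that a speed-up is a reference time divided by the parallel time on P processors.

```python
def classical_speedup(t_best_sequential, t_parallel_p):
    """Reference: the best available sequential code (here, a 3-D FFT library run)."""
    if t_parallel_p <= 0:
        raise ValueError("parallel time must be positive")
    return t_best_sequential / t_parallel_p


def relative_speedup(t_parallel_1, t_parallel_p):
    """Reference: the parallel method itself, run on a single processor."""
    if t_parallel_p <= 0:
        raise ValueError("parallel time must be positive")
    return t_parallel_1 / t_parallel_p


# Hypothetical timings in seconds (assumptions, not data from the paper).
t_sequential_library = 2.0   # best sequential code
t_parallel_on_1 = 8.0        # parallel method on one processor (much slower)
t_parallel_on_p = 1.0        # parallel method on P processors

print(classical_speedup(t_sequential_library, t_parallel_on_p))
print(relative_speedup(t_parallel_on_1, t_parallel_on_p))
```

With these made-up numbers the relative definition reports a far larger ratio than the classical one, because the single-processor run of the parallel method is a weak baseline. That is the reason the quoted authors give for calling the relative definition meaningless in their setting.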