A nesting technique is proposed for computing discrete Fourier transforms (DFTs). The technique relies on only two primitive modules; all other modules are generated by a standard nesting procedure. In a software realization, the computation speed of this approach is comparable to that of Winograd Fourier transform algorithms (WFTAs), whereas its program size is smaller than that of WFTAs. The approach is therefore most suitable for cases where memory size is restricted. For a hardware realization, two simple systolic cells are suggested for realizing long DFTs in a pipelined systolic structure. This new architecture is most suitable for realization using VLSI techniques and requires significantly fewer devices than previously reported methods.
Introduction
Recently, numerous efforts have been devoted to the high-performance realization of digital signal processing algorithms using parallel computer architectures. However, the performance of parallel computing systems is strongly affected by the complexity of inter-processor communication and by how well concurrency is exploited. The key to obtaining high efficiency is to keep a good balance among system control, data communication and computational complexity. Systolic arrays (Kung 1982) aim at a good compromise among these factors while maintaining simple and highly regular architectures. This allows the huge computing capacities offered by contemporary VLSI technologies to be more readily utilized. Unfortunately, it is often difficult to map conventional fast algorithms onto efficient parallel computing hardware structures. Problems usually arise from complicated dependency structures. A commonly used approach is to parallelize a high-level operation directly from its basic definition. The speed-up is then achieved by employing multiple functional units. This results in an increased hardware cost that is often not justifiable when compared to the drastic reduction in computational complexity offered by fast algorithms, especially in application-specific systems.
The discrete Fourier transform (DFT) is one of the most important tools in modern digital signal processing applications. Efficient algorithms for the realization of the discrete Fourier transform are desirable and critical to the success of many systems. The fast Fourier transform (FFT), which allows a tremendous saving in the calculation of discrete Fourier transforms, was introduced by Cooley and Tukey (1965). However, the number of multiplications it requires is still relatively large owing to the numerous twiddle factors involved. There is a further penalty in an actual implementation: these twiddle factors must be supplied either from a large storage memory or by a time-consuming run-time evaluation.
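The twiddle-factor cost mentioned above can be illustrated with a minimal sketch. It assumes a textbook recursive radix-2 decimation-in-time FFT, which is not necessarily the formulation used in any particular implementation. The function names `direct_dft`, `make_twiddles` and `fft_radix2`, and the one-element `counter` list, are hypothetical helpers introduced only for this illustration. The sketch uses the "storage" option: the twiddle factors are precomputed once into a table instead of being evaluated at run time.

```python
import cmath


def direct_dft(x):
    """Evaluate the DFT directly from its definition (N*N complex products)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def make_twiddles(n):
    """Precompute the table W_n^k = exp(-2*pi*i*k/n) for k = 0 .. n/2 - 1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft_radix2(x, twiddles, counter):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two.

    `twiddles` is the table built for the full transform length.
    `counter` is a one-element list that accumulates the number of
    twiddle multiplications, including trivial ones such as W^0 = 1.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2], twiddles, counter)
    odd = fft_radix2(x[1::2], twiddles, counter)
    # A sub-transform of length n reuses every (N/n)-th entry of the full table.
    stride = 2 * len(twiddles) // n
    out = [0j] * n
    for k in range(n // 2):
        t = twiddles[k * stride] * odd[k]  # one twiddle multiplication
        counter[0] += 1
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

In this sketch a length-N transform performs (N/2) log2 N twiddle multiplications and keeps a table of N/2 complex values, so N = 1024 gives 5120 multiplications and 512 stored factors. The count includes trivial factors that a tuned implementation would skip.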
On the other hand, the global data communication nature of its butterfly structure imposes many