Abstract. Fast Fourier Transform (FFT) and Discrete Cosine Transform (DCT) processors are designed and implemented using a Xilinx XC4010 Field Programmable Gate Array (FPGA) device. This device allows a 16-point FFT/DCT processor implementation. To design a CLB-efficient FFT/DCT processor in FPGAs, a pipelined bit-serial architecture with bit-parallel input data format is employed. These processors operate with a 20 MHz bit-clock and a 16-bit system word length, and compute an entire 16-point DFT/DCT transform every 16 bit-clock cycles.
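As a rough illustration of the figures quoted above, the sketch below works out the implied transform rate and checks a software 16-point radix-2 FFT against a direct DFT. It assumes that one complete transform is produced per word time (16 bit-clock cycles); the function names are hypothetical, and the code is a floating-point software model, not the bit-serial hardware datapath.

```python
import cmath

BIT_CLOCK_HZ = 20e6  # bit-clock quoted in the abstract
WORD_LENGTH = 16     # system word length in bits
N_POINTS = 16        # transform size

def transforms_per_second(bit_clock_hz=BIT_CLOCK_HZ, word_length=WORD_LENGTH):
    # Assumption: one full transform completes every `word_length` bit-clocks.
    return bit_clock_hz / word_length

def dft(x):
    # Direct O(N^2) discrete Fourier transform, used here as the reference.
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def fft(x):
    # Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

if __name__ == "__main__":
    samples = [complex(i % 5, -i) for i in range(N_POINTS)]
    worst = max(abs(p - q) for p, q in zip(fft(samples), dft(samples)))
    print("transforms per second:", transforms_per_second())
    print("largest FFT/DFT mismatch:", worst)
```

Under the stated assumption, a 20 MHz bit-clock and a 16-bit word correspond to 1.25 million transforms per second.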
Abstract. This paper presents the development of a fixed-point bit-parallel multiply-add fused (MAF) architecture together with a corresponding VLSI implementation. The proposed MAF implementation employs the 1.2 µm CMOS technology provided by Northern Telecom Electronics and available through the Canadian Microelectronics Corporation (CMC). This MAF implementation finds a variety of practical applications in high-speed real-time digital signal processing (DSP). The MAF implementation employs a parallel modified Booth multiplier incorporating an array of carry-save adders for the addition of the intermediate partial products, and a hardware-efficient carry-skip adder for carry propagation.
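The data flow described above can be sketched in software as follows. This is a minimal behavioural model under stated assumptions, not the paper's circuit: the multiplier is recoded into radix-4 modified Booth digits, the partial products and the addend are reduced with 3:2 carry-save steps, and a single carry-propagating addition finishes the result. Python's built-in integer addition stands in for the carry-skip adder, and all function names are hypothetical.

```python
def booth_radix4_digits(b, n_bits):
    """Recode an n_bits two's-complement multiplier into radix-4 digits in -2..2."""
    if n_bits % 2:
        raise ValueError("n_bits must be even")
    u = b & ((1 << n_bits) - 1)  # two's-complement bit pattern of b
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, n_bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # digit = -2*b[i+1] + b[i] + b[i-1]
        prev = b1
    return digits

def carry_save_add(x, y, z, mask):
    """3:2 compressor: returns (sum, carry) with sum + carry == x + y + z mod 2^W."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s & mask, c & mask

def multiply_add(a, b, c, n_bits=16):
    """Compute a * b + c for signed n_bits operands (assumed to be in range)."""
    width = 2 * n_bits
    mask = (1 << width) - 1
    sum_word, carry_word = c & mask, 0
    for i, digit in enumerate(booth_radix4_digits(b, n_bits)):
        partial = ((digit * a) << (2 * i)) & mask  # Booth partial product
        sum_word, carry_word = carry_save_add(sum_word, carry_word, partial, mask)
    # One carry-propagating addition (stand-in for the carry-skip adder).
    total = (sum_word + carry_word) & mask
    # Reinterpret the width-bit pattern as a signed value.
    return total - (1 << width) if total >> (width - 1) else total

if __name__ == "__main__":
    print(booth_radix4_digits(-5, 8))
    print(multiply_add(3, -5, 7, n_bits=8))
```

Radix-4 recoding halves the number of partial products (n/2 instead of n), and carry-save reduction defers carry propagation to the single final addition.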
This paper presents combined area-efficient and time-efficient systolic architectures for parallel Booth multiplication. These systolic architectures employ composite (instead of fine-grained) cells in order to optimize the silicon area and latency. The complexity of the composite cell is controllable by choosing the proper input size. The composite cell design takes advantage of algorithmic improvements within the cell. These cells are connected only to their neighbors.

Introduction

Recent advancements in IC fabrication technology with respect to the reduction in minimum feature size, enlargement in chip size, and improvement in packing efficiency have made it feasible to put millions of transistors on a single chip. Miniaturization of MOS transistor dimensions continues to improve circuit speed and packing density, whereas interconnection capacitance and resistance increase linearly with respect to the scaling factor. With increasing chip dimensions, parasitic interconnection capacitance dominates the gate capacitance, and hence the RC time constant of the interconnection determines the circuit performance. Such factors become predominant particularly in ULSI technology. For ULSI environments, a new design philosophy, "locally long but globally short interconnections", has been proposed in [1][2], which permits a complex (composite) cell design rather than a fine-grained cell for systolic architectures. This paper applies the concept to modified Booth parallel multiplication.

The direct implementation of multiplication using a shift-and-add parallel array requires O(n^2) silicon area and 3n - 2 cell delays for an n × n multiplication [3]; each cell consists of a carry-save adder, an AND gate, and data latches. A bit-parallel systolic array has been proposed for multiplication by McCanny and McWhirter [4]. It consists of a diamond-shaped array of n^2 latched carry-save adder cells. Each cell is connected to its nearest neighbors, but the array requires 3n clock cycles.
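To make the quoted costs concrete, the following sketch tabulates the cell counts and latencies stated above for the two array designs and includes a plain software shift-and-add multiply for reference. The function names and dictionary keys are hypothetical, the area of the shift-and-add array is taken as n^2 cells (the text gives it only as O(n^2)), and the multiply routine models the arithmetic only, not the latched array timing.

```python
def shift_and_add_array_cost(n):
    # Figures quoted for the direct n x n shift-and-add array [3];
    # n * n cells is an assumed concrete value for the stated O(n^2) area.
    return {"cells": n * n, "cell_delays": 3 * n - 2}

def bit_parallel_systolic_cost(n):
    # Figures quoted for the diamond-shaped bit-parallel systolic array [4].
    return {"cells": n * n, "clock_cycles": 3 * n}

def shift_and_add_multiply(a, b, n):
    """Unsigned n-bit multiply: one AND-gated, shifted copy of a per bit of b."""
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must be unsigned n-bit values")
    product = 0
    for i in range(n):
        if (b >> i) & 1:       # the AND gate selects this row
            product += a << i  # row i is weighted by 2^i
    return product

if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, shift_and_add_array_cost(n), bit_parallel_systolic_cost(n))
    print(shift_and_add_multiply(13, 11, 4))
```

Both designs grow quadratically in cell count, which is the cost the composite-cell approach aims to reduce.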
A one-dimensional serial systolic multiplier array has been proposed for modified Booth implementation [5]. It consists of n/2 cells, but each cell contains a ripple-carry adder, and it requires 2n - 1 clock cycles. A recent floating-point design exploits the modified Booth algorithm using Wallace tree reduction and a Dadda parallel counter [6].

In this paper, we present a novel area-efficient and time-efficient systolic parallel modified Booth multiplier. A suitable computational model for ULSI has been adopted from the existing VLSI models. Taking wire delays and cell delays into account, we propose systolic architectures with complex cells, which require less area and latency than the fine-grained implementation. The complex cell design makes it possible to take advantage of developments in parallel multipliers, which improves the cell. We then show that our design approach is viable and results in an area-efficient and time-efficient implementation.

2. Scaling a MOS transistor reduces the dimensions of a device and increases its speed. On the other hand, for interconnections RC constant per unit len...