Abstract: We have suggested a new data-access scheme for the computation of lifting two-dimensional (2-D) discrete wavelet transform (DWT) without using data transposition. We have derived a linear systolic array directly from the dependence graph (DG) and a 2-D systolic array from a suitably segmented DG for parallel and pipeline implementation of 1-D DWT. These two systolic arrays are used as building blocks to derive the proposed transposition-free structure for lifting 2-D DWT. The proposed structure requires only a…
“…Due to less ADP, we have chosen the proposed radix-8 generic-constant fixed-width multiplier design over the existing radix-4 generic-constant Booth multiplier design to develop area-delay-efficient fixed-point architectures for computation of 2-D DWT. We have considered the existing multiplier-based 2-D DWT structure of [13] and modified this structure to take advantage of radix-8 generic-constant multiplier.…”
Section: Comparison Of Synthesis Results (mentioning)
confidence: 99%
“…Since, no redundant computation is available in lifting 2-D DWT, there is no scope to reduce multiplier complexity without compromising on the throughput rate. However, we observe that many multipliers of block-lifting 2-D DWT structures [12,13,15,17] share a common input operand. A group of multipliers with a common multiplying operand can select their partial product terms from a common set using Booth encoding scheme.…”
Section: Introduction (mentioning)
confidence: 91%
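The idea of several multipliers drawing their partial products from one common set can be sketched in software. The following is a minimal illustration, not the cited authors' hardware design: it assumes radix-8 Booth recoding of each constant into digits in [-4, 4], and the function names (`booth_radix8_digits`, `shared_multiples`, `multiply_from_shared`) are hypothetical.

```python
def booth_radix8_digits(c, n):
    """Recode an n-bit two's-complement integer c into radix-8 Booth digits.

    Digits are returned least-significant first, each in the range [-4, 4].
    The operand is sign-extended to a multiple of 3 bits (assumption of this sketch).
    """
    groups = -(-n // 3)                      # ceil(n / 3) digit positions
    u = c & ((1 << (3 * groups)) - 1)        # sign-extended bit pattern
    digits = []
    prev = 0                                 # bit just below the current group
    for i in range(groups):
        b0 = (u >> (3 * i)) & 1
        b1 = (u >> (3 * i + 1)) & 1
        b2 = (u >> (3 * i + 2)) & 1
        digits.append(-4 * b2 + 2 * b1 + b0 + prev)
        prev = b2
    return digits


def shared_multiples(x):
    """Precompute every multiple of the common operand x that a digit can select."""
    return {d: d * x for d in range(-4, 5)}


def multiply_from_shared(multiples, digits):
    """Form one product by selecting partial products from the shared set."""
    return sum(multiples[d] << (3 * i) for i, d in enumerate(digits))


# One common operand, several constants: the multiples are computed once.
x = 37
common = shared_multiples(x)
constants = [5, -100, 2047, -2048]
products = [multiply_from_shared(common, booth_radix8_digits(c, 12)) for c in constants]
```

Here the set of multiples of `x` is built once and reused by every constant, which mirrors the sharing described in the statement above at the level of arithmetic only.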
“…Interestingly, the lifting-based 2-D DWT structure involves constant multiplication. The lifting 2-D DWT structure of block size P involves 4.5P multipliers [13]. One operand of these 4.5P multipliers is one of the four lifting constants {α, β, γ, δ} or two scaling constants {k², 1/k²}.…”
Section: Comparison Of Area-delay Complexity Of Lifting 2-d Dwt Struc (mentioning)
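To make the role of the lifting constants concrete, here is a minimal 1-D sketch of the four lifting steps followed by scaling, with its inverse. It is an illustration under stated assumptions rather than the structure of [13]: the numeric values are the commonly quoted 9/7 lifting coefficients, the scaling convention (low-pass multiplied by K, high-pass divided by K) and the boundary handling (repeating the edge sample) are choices made for this sketch, and all names are hypothetical. In a separable 2-D transform the row and column scalings combine, which is where factors such as k² and 1/k² arise.

```python
# Commonly quoted 9/7 lifting coefficients (assumed values for this sketch).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398


def _lift(target, source, coeff, use_left):
    """In place: target[i] += coeff * (sum of the two source neighbours of i)."""
    n = len(target)
    for i in range(n):
        if use_left:
            pair = source[max(i - 1, 0)] + source[i]
        else:
            pair = source[i] + source[min(i + 1, n - 1)]
        target[i] += coeff * pair


def forward_97(x):
    """One level of 1-D lifting DWT on an even-length sequence."""
    if len(x) % 2:
        raise ValueError("even length expected")
    s, d = list(x[0::2]), list(x[1::2])
    _lift(d, s, ALPHA, use_left=False)   # predict 1
    _lift(s, d, BETA, use_left=True)     # update 1
    _lift(d, s, GAMMA, use_left=False)   # predict 2
    _lift(s, d, DELTA, use_left=True)    # update 2
    return [K * v for v in s], [v / K for v in d]


def inverse_97(low, high):
    """Undo forward_97 by running the steps backwards with negated constants."""
    s, d = [v / K for v in low], [v * K for v in high]
    _lift(s, d, -DELTA, use_left=True)
    _lift(d, s, -GAMMA, use_left=False)
    _lift(s, d, -BETA, use_left=True)
    _lift(d, s, -ALPHA, use_left=False)
    out = []
    for even, odd in zip(s, d):
        out.extend([even, odd])
    return out


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
low, high = forward_97(signal)
restored = inverse_97(low, high)
```

Every multiplication in `forward_97` has one of the constants as an operand, which is why constant-multiplier designs apply directly to this kind of structure.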
In this paper, we present a regular partial product array (PPA) for radix-8 Booth multiplication by removing the extra row with a small overhead complexity. A radix-8 multiplier design is proposed based on the regular PPA, which offers a saving of 10.7% in area-delay product (ADP) over the existing radix-8 multiplier design. The n lower-order bits of the 2n-bit output of the full-width multiplier are truncated to obtain a fixed-width multiplier with low truncation error, where n is the operand bit-width. A few redundant logic operations are created in the adder unit when the n lower-order bits of the 2n-bit multiplier output are truncated. A specific design is necessary because modern synthesis tools remove this redundant logic only partially. We present an optimized adder unit design, with the redundant logic removed, for the post-truncated fixed-width radix-8 Booth multiplier. Comparison results show that the proposed post-truncated fixed-width multiplier design offers nearly 20.7% ADP saving and 18.3% power saving over the existing radix-8 design optimized by the Synopsys Design Compiler when the 2n-bit output is post-truncated to n bits. Multipliers are often used for multiplication by a constant. The value of the constant may be fixed or may be changed at run-time by the user. A multiplier that multiplies by a fixed constant is referred to as a fixed-constant multiplier, and one that multiplies by a constant which changes during run-time is referred to as a generic-constant multiplier. Both radix-4 and radix-8 Booth multiplier designs can easily be configured as a generic-constant multiplier. However, the radix-8 multiplier design saves some area and delay when configured for constant multiplication, while the radix-4 multiplier design does not have this feature. We find that the proposed 12-bit full-width and fixed-width radix-8 generic-constant multiplier designs involve, respectively, 19.4% and 24.7% less ADP than the existing radix-4 full-width and post-truncated multiplier designs configured for constant multiplication. The existing block-based lifting 2-D DWT structure is synthesized using the proposed radix-8 generic-constant fixed-width multiplier design to demonstrate the effectiveness of the proposed multiplier designs. We find that the existing lifting 2-D DWT structure of block size 16 and word length 12 offers 19.3% ADP saving and 11.5% power saving when the constant multipliers are implemented using the proposed radix-8 multiplier design instead of the existing radix-4 multiplier design. (Abhishek Choubey, Circuits Syst Signal Process)
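The post-truncation described in this abstract can be illustrated arithmetically. The sketch below models only the numerical behaviour (compute the full 2n-bit product, then drop the n lower-order bits); it does not model the regular PPA or the optimized adder unit, and the function names are hypothetical.

```python
def full_width_multiply(a, b, n):
    """Exact product of two n-bit two's-complement operands (fits in 2n bits)."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operands must fit in n bits")
    return a * b


def post_truncated_multiply(a, b, n):
    """Fixed-width result: discard the n lower-order bits of the full product."""
    return full_width_multiply(a, b, n) >> n   # arithmetic shift keeps the sign


def truncation_error(a, b, n):
    """Difference between the exact product and the rescaled fixed-width result."""
    return full_width_multiply(a, b, n) - (post_truncated_multiply(a, b, n) << n)


n = 12
pairs = [(1234, -567), (-2048, 2047), (7, 9), (-1, 1)]
errors = [truncation_error(a, b, n) for a, b in pairs]
```

Because the full product is formed before the low bits are dropped, the error in this model always lies in the range 0 to 2^n - 1, that is, below one unit in the last place of the n-bit result.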
“…Unlike RPA-based designs, folded design involves simple control circuitry and it has 100 % HUE. Keeping this in view, several architectures based have been proposed for efficient implementation of lifting 2-D DWT [5-12]. Most of the designs differ by their number of arithmetic components, on-chip memory, cycle period and throughput rate.…”
In this paper, we have proposed a look-up-table (LUT) based structure for high-throughput implementation of multilevel lifting DWT. The proposed structure processes one block of samples at a time to achieve a high throughput rate. Compared with the best of the similar existing structures, it does not involve any multipliers, but it requires more adders and 21504 extra ROM words for J=3; it offers less critical-path delay than the existing structure. Synthesis results show that the proposed structure has less ADP, 56% less area, and 13% less power than the existing structure for block size J=2. Similarly, the proposed structure has 64% less ADP and 21% less power than the existing structure for J=3. The proposed structure is fully scalable to higher block sizes and offers the flexibility to derive area-delay-efficient structures for various applications.