Area- and Power-Efficient Architecture for High-Throughput Implementation of Lifting 2-D DWT

Mohanty, Basant Kumar; Mahajan, Anurag; Meher, Pramod Kumar

doi:10.1109/tcsii.2012.2200169

Cited by 45 publications

(71 citation statements)

References 8 publications

“…Due to less ADP, we have chosen the proposed radix-8 generic-constant fixed-width multiplier design over the existing radix-4 generic-constant Booth multiplier design to develop area-delay-efficient fixed-point architectures for computation of 2-D DWT. We have considered the existing multiplier-based 2-D DWT structure of [13] and modified this structure to take advantage of radix-8 generic-constant multiplier.…”

Section: Comparison Of Synthesis Results (mentioning)

confidence: 99%

“…Since no redundant computation is available in lifting 2-D DWT, there is no scope to reduce multiplier complexity without compromising on the throughput rate. However, we observe that many multipliers of block-lifting 2-D DWT structures [12,13,15,17] share a common input operand. A group of multipliers with a common multiplying operand can select their partial product terms from a common set using the Booth encoding scheme.…”

Section: Introduction (mentioning)

confidence: 91%
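
The shared-operand idea in the statement above can be illustrated with a minimal Python sketch. It assumes the common operand `x` is the multiplicand, recodes each multiplier `y` into radix-8 Booth digits in the range -4 to 4, and lets every product select its partial products from one precomputed set {0, x, 2x, 3x, 4x}. The function names are hypothetical, and the sketch models arithmetic behaviour only, not the hardware structure of the cited designs.

```python
def booth_radix8_digits(y, n):
    """Recode an n-bit two's-complement integer y into radix-8 Booth digits.

    Each digit lies in [-4, 4] and sum(d * 8**i) equals y.
    """
    if not -(1 << (n - 1)) <= y < (1 << (n - 1)):
        raise ValueError("y does not fit in n bits")

    def bit(k):
        # Python's >> on negative ints sign-extends, so bits above n-1
        # automatically repeat the sign bit; the bit below position 0 is 0.
        return 0 if k < 0 else (y >> k) & 1

    num_digits = -(-n // 3)  # ceil(n / 3)
    return [
        -4 * bit(3 * i + 2) + 2 * bit(3 * i + 1) + bit(3 * i) + bit(3 * i - 1)
        for i in range(num_digits)
    ]


def shared_multiples(x):
    """Precompute the common set of multiples of x once (3x is the 'hard' one)."""
    return {0: 0, 1: x, 2: x << 1, 3: (x << 1) + x, 4: x << 2}


def multiply_with_shared_multiples(multiples, y, n):
    """Form x*y by selecting partial products from the shared multiple set."""
    acc = 0
    for i, d in enumerate(booth_radix8_digits(y, n)):
        pp = multiples[abs(d)]
        acc += (-pp if d < 0 else pp) << (3 * i)
    return acc


# Several multipliers that share the operand x reuse one multiple set.
x = 37
common = shared_multiples(x)
products = [multiply_with_shared_multiples(common, y, 12) for y in (5, -100, 2047)]
```

In this sketch the odd multiple 3x is computed once and reused by every product that shares `x`, which is the kind of saving the statement attributes to a common multiplying operand.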

“…Interestingly, the lifting-based 2-D DWT structure involves constant multiplication. The lifting 2-D DWT structure of block size P involves 4.5P multipliers [13]. One operand of these 4.5P multipliers is one of the four lifting constants {α, β, γ, δ} or two scaling constants {k², 1/k²}.…”

Section: Comparison Of Area-delay Complexity Of Lifting 2-D DWT Struc (mentioning)

confidence: 99%
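
To make the role of the lifting constants concrete, here is a minimal 1-D sketch of the lifting scheme, assuming the commonly tabulated CDF 9/7 values for α, β, γ, δ and one common scaling convention (low-pass multiplied by K, high-pass divided by K). The statement above does not give the constants or the boundary handling, so the numeric values, the edge mirroring, and all function names are assumptions. The point is only that every multiplication has one constant operand.

```python
# Commonly tabulated CDF 9/7 lifting constants (assumed, not taken from the statement).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling constant; conventions differ between papers


def _predict(s, d, c):
    # d[i] += c * (s[i] + s[i+1]), mirroring at the right edge
    m = len(s)
    return [d[i] + c * (s[i] + s[min(i + 1, m - 1)]) for i in range(m)]


def _update(s, d, c):
    # s[i] += c * (d[i-1] + d[i]), mirroring at the left edge
    return [s[i] + c * (d[max(i - 1, 0)] + d[i]) for i in range(len(s))]


def forward_97(x):
    """One level of 1-D lifting on an even-length sequence: returns (low, high)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    s, d = list(x[0::2]), list(x[1::2])
    d = _predict(s, d, ALPHA)
    s = _update(s, d, BETA)
    d = _predict(s, d, GAMMA)
    s = _update(s, d, DELTA)
    return [K * v for v in s], [v / K for v in d]


def inverse_97(low, high):
    """Undo forward_97 by running the lifting steps backwards with negated constants."""
    s = [v / K for v in low]
    d = [v * K for v in high]
    s = _update(s, d, -DELTA)
    d = _predict(s, d, -GAMMA)
    s = _update(s, d, -BETA)
    d = _predict(s, d, -ALPHA)
    x = [0.0] * (2 * len(s))
    x[0::2] = s
    x[1::2] = d
    return x
```

Each lifting step multiplies a sum of two samples by a fixed constant, which is why constant-multiplier designs are relevant to this structure.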

Efficient Design for Radix-8 Booth Multiplier and Its Application in Lifting 2-D DWT

Mohanty

Choubey

2016

Circuits Syst Signal Process

In this paper, we present a regular partial product array (PPA) for radix-8 Booth multiplication by removing the extra row with a small overhead complexity. A radix-8 multiplier design is proposed based on the regular PPA, which offers a saving of 10.7% in area-delay product (ADP) over the existing radix-8 multiplier design. The n lower-order bits of the 2n-bit output of the full-width multiplier are truncated to obtain a fixed-width multiplier with low truncation error, where n is the operand bit-width. A few redundant logic operations are created in the adder unit when the n lower-order bits of the 2n-bit multiplier output are truncated. A specific design is necessary, as modern synthesis tools only partially remove this redundant logic. We present an optimized adder unit design after removing redundant logic for the post-truncated fixed-width radix-8 Booth multiplier. Comparison results show that the proposed post-truncated fixed-width multiplier design offers nearly 20.7% ADP and 18.3% power saving over the existing radix-8 design optimized by the Synopsys Design Compiler when the 2n-bit output is post-truncated to n bits. Multipliers are often used for multiplication by a constant. The value of the constant may be fixed or may be changed at run-time by the user. A multiplier that multiplies by a fixed constant is referred to as a fixed-constant multiplier, and one that multiplies by a constant that changes at run-time is referred to as a generic-constant multiplier. Both radix-4 and radix-8 Booth multiplier designs can easily be configured as a generic-constant multiplier. However, the radix-8 multiplier design saves some area and delay when configured for constant multiplication, whereas the radix-4 multiplier design does not. We find that the proposed 12-bit full-width and fixed-width radix-8 generic-constant multiplier designs involve, respectively, 19.4% and 24.7% less ADP than the existing radix-4 full-width and post-truncated multiplier designs configured for constant multiplication. The existing block-based lifting 2-D DWT structure is synthesized using the proposed radix-8 generic-constant fixed-width multiplier design to demonstrate the effectiveness of the proposed multiplier designs. We find that the existing lifting 2-D DWT structure of block size 16 and word length 12 offers 19.3% ADP saving and 11.5% power saving when the constant multipliers are implemented using the proposed radix-8 multiplier design instead of the existing radix-4 multiplier design.
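
The post-truncation step described in the abstract can be sketched behaviourally as follows. This is a minimal illustration, assuming signed n-bit two's-complement operands and truncation by an arithmetic right shift. The function names are hypothetical, and the sketch says nothing about the optimized adder unit, which is a gate-level matter.

```python
def full_width_product(x, y, n):
    """Exact 2n-bit product of two signed n-bit operands."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= x <= hi and lo <= y <= hi):
        raise ValueError("operands must fit in n bits")
    return x * y


def post_truncated_product(x, y, n):
    """Fixed-width result: compute the full product, then drop the n lower-order bits."""
    return full_width_product(x, y, n) >> n


def truncation_error(x, y, n):
    """Value discarded by post-truncation, in units of the full-width LSB."""
    return full_width_product(x, y, n) - (post_truncated_product(x, y, n) << n)


# Example with n = 12: the discarded part is always in [0, 2**12).
n = 12
err = truncation_error(1234, -567, n)
```

Because the full product is formed before the low bits are discarded, the error stays below one LSB of the n-bit result, which is consistent with the low truncation error the abstract mentions for post-truncation.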

“…Unlike RPA-based designs, the folded design involves simple control circuitry and has 100% HUE. Keeping this in view, several architectures have been proposed for efficient implementation of lifting 2-D DWT [5][6][7][8][9][10][11][12]. Most of the designs differ in their number of arithmetic components, on-chip memory, cycle period, and throughput rate.…”

Section: Two-dimensional (2-D) Discrete Wavelet Transform (DWT) (mentioning)

confidence: 99%

“…Low-pass block: a(9,8), a(10,8), a(11,8), a(12,8), a(13,8), a(14,8), a(15,8); h(4,4), h(4,5), h(4,7), h(4,8); l(4,5), l(4,6), l(4,7), l(4,8)…”

Section: Row-processor Column-processor (mentioning)

confidence: 99%

A Block based Area-Delay Efficient Architecture for Multi-Level Lifting 2-D DWT

Choubey

Mohanty

2017

IJCA

In this paper, we propose a look-up-table (LUT) based structure for high-throughput implementation of multilevel lifting DWT. The proposed structure can process one block of samples to achieve a high throughput rate. Compared with the best of the similar existing structures, it does not involve any multipliers, but it requires more adders and 21504 extra ROM words for J=3; it offers less critical-path delay than the existing structure. Synthesis results show that the proposed structure has less ADP, 56% less area, and 13% less power than the existing structure for block size J=2. Similarly, the proposed structure has 64% less ADP and 21% less power than the existing structure for J=3. The proposed structure is fully scalable for higher block sizes and offers the flexibility to derive area-delay-efficient structures for various applications.
