Several computationally intensive applications in machine learning, signal processing, and computer vision call for convolution between a fixed vector and each of a stream of incoming vectors. Often the convolution need not be exact, because a subsequent processing unit, such as an activation function in a neural network or a visual unit in image processing, can tolerate a computational error, allowing the convolution algorithm to be optimized. This paper develops a method of approximate convolution and quantifies its performance in software and hardware. The key idea is to exploit the known fixed vector, view the convolution as a dot product, and approximate the angle between the fixed vector and each incoming vector geometrically. We evaluate the proposed method in terms of accuracy, running-time complexity, and power consumption on field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) hardware platforms. In a benchmark test, the accuracy of the approximate convolution is 3.7% lower than that of the exact convolution, a tolerable loss for machine learning and signal processing. The proposed method reduces the number of hardware operations and lowers the power consumption relative to conventional convolution and to an existing approximate convolution by approximately 20% and 10%, respectively, while maintaining the same throughput and latency. We also test the proposed method on 2D convolution and on a convolutional neural network (CNN); relative to the conventional method, it reduces the complexity, the power consumption of 2D convolution, and the power consumption of the CNN by approximately 22%, 25%, and 13%, respectively. The proposed approximate convolution trades accuracy for lower running-time complexity and hardware power consumption, and it has practical utility in computationally intensive tasks that tolerate a margin of convolutional error.

INDEX TERMS Convolution, dot product, power consumption, field programmable gate array (FPGA)
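To make the key idea concrete, the following is a minimal Python sketch of convolution viewed as a sliding dot product a·b = |a||b|cos θ, where the fixed vector's norm and sign pattern are precomputed offline and the angle θ is estimated geometrically per window. The sign-agreement angle estimator and the function name approx_conv1d are illustrative assumptions for this sketch, not the paper's actual approximation.

```python
import numpy as np

def approx_conv1d(x, w):
    """Sliding approximate dot product between a fixed kernel w and input x.

    Because w is fixed, its norm and sign pattern are computed once;
    each window of x then needs only its own norm and a cheap angle
    estimate instead of a full multiply-accumulate.
    """
    n = len(w)
    w_norm = np.linalg.norm(w)   # precomputed offline: w never changes
    w_sign = np.sign(w)
    out = np.empty(len(x) - n + 1)
    for i in range(len(out)):
        win = x[i:i + n]
        # Geometric angle proxy (an assumption, not the paper's exact
        # estimator): the fraction of sign agreements between w and the
        # window is mapped linearly to an angle in [0, pi].
        agree = np.count_nonzero(w_sign == np.sign(win)) / n
        theta = np.pi * (1.0 - agree)
        # |w| * |x_win| * cos(theta) recovers the approximate dot product.
        out[i] = w_norm * np.linalg.norm(win) * np.cos(theta)
    return out

# Comparison against the exact sliding dot product (valid correlation):
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
w = rng.standard_normal(8)
exact = np.correlate(x, w, mode="valid")
approx = approx_conv1d(x, w)
print(np.mean(np.abs(exact - approx)))
```

In hardware, the appeal of this decomposition is that the per-window work reduces to norm and sign comparisons, which are cheaper than the multiply-accumulate chain of an exact dot product; the specific estimator above is only a stand-in for whatever geometric approximation the paper employs.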