Atul Rahman scite author profile

This paper presents a new methodology for designing hardware that computes the square root of N-bit unsigned numbers. The proposed design is based on the modified non-restoring square root algorithm. Two hardware designs are proposed for the N-bit fixed-point square root operation: a sequential pipelined architecture and an asynchronous architecture. The synthesis report for the FPGA-based pipelined hardware for the 32-bit square root operation shows that it uses significantly fewer FPGA logic resources than earlier pipelined designs based on the modified non-restoring algorithm. Moreover, the pipelined design can be configured to calculate the square root of a 32-bit number in either 16 or 8 clock cycles. The maximum frequency achieved at a latency of 16 clock cycles for the 32-bit unsigned square root is 403.770 MHz; at a latency of 8 clock cycles it is 260.233 MHz. The proposed asynchronous FPGA design, in turn, uses fewer hardware resources than earlier asynchronous designs for the N-bit square root operation. Both the pipelined and asynchronous designs are tested on Xilinx Virtex 7 XC7VX980T-2, Virtex 5 XC5VLX330T-2 and Spartan 3E XC3S1600E-5 FPGAs.
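For readers unfamiliar with the underlying technique, the following is a minimal software sketch of the textbook non-restoring integer square root. It is not the paper's modified algorithm or its hardware design, and the function name `nonrestoring_isqrt` and the parameter `n_bits` are illustrative assumptions. It produces one result bit per iteration, so a 32-bit input takes 16 iterations, which is consistent with the 16-cycle latency quoted above.

```python
def nonrestoring_isqrt(d, n_bits=32):
    """Textbook non-restoring square root of an unsigned n_bits-wide integer.

    Returns (q, r) with q = floor(sqrt(d)) and r = d - q*q.
    Illustrative sketch only; names and structure are assumptions,
    not the design described in the abstract.
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even")
    if not 0 <= d < (1 << n_bits):
        raise ValueError("d does not fit in n_bits")

    q = 0  # partial root, gains one bit per iteration
    r = 0  # signed partial remainder (may go negative, never restored mid-loop)
    for i in range(n_bits // 2 - 1, -1, -1):
        two_bits = (d >> (2 * i)) & 0b11  # next pair of radicand bits
        if r >= 0:
            # Remainder non-negative: subtract the trial value (4q + 1).
            r = r * 4 + two_bits - ((q << 2) | 1)
        else:
            # Remainder negative: add (4q + 3) instead of restoring first.
            r = r * 4 + two_bits + ((q << 2) | 3)
        # The sign of the new remainder selects the next root bit.
        q = (q << 1) | (1 if r >= 0 else 0)

    if r < 0:
        # Single final correction so the returned remainder is exact.
        r += (q << 1) | 1
    return q, r


if __name__ == "__main__":
    for value in (0, 1, 10, 144, 2**32 - 1):
        print(value, nonrestoring_isqrt(value))
```

The only restoring step is the final correction, which is what distinguishes this scheme from a restoring square root that undoes every failed subtraction.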


Optimized hardware architecture for implementing IEEE 754 standard double precision floating point adder/subtractor

Rahman, Abdullah-Al-Kafi, Khalid, et al. 2014


An IEEE 754 standard double precision (64-bit) binary floating point arithmetic unit is often necessary in complex digital signal processing applications. The basic operations, floating point addition and subtraction, must be optimized so that floating point multiplication, division and square root can be computed efficiently. The main challenge, however, is to design floating point arithmetic hardware that uses fewer FPGA and ASIC logic resources and achieves a high operating frequency in fewer clock cycles. This paper proposes a new, efficient hardware design methodology for implementing double precision floating point addition and subtraction. The pipelined hardware design is implemented on Xilinx Virtex-6 and Virtex-5 FPGAs. According to the synthesis results, the maximum operating frequency achieved by the proposed design at a latency of 8 clock cycles is significantly higher than that of previous hardware designs. Furthermore, the area overhead is 50 percent lower than that of earlier hardware designs for computing IEEE 754 compliant double precision floating point addition and subtraction.
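As background for the format this adder/subtractor operates on, the sketch below splits a double into the three IEEE 754 binary64 fields (1 sign bit, 11 exponent bits, 52 fraction bits). It only illustrates the data layout, not the paper's hardware design, and the helper name `unpack_double` is an assumption made for this example.

```python
import struct


def unpack_double(x):
    """Split a Python float (IEEE 754 binary64) into its raw bit fields.

    Returns (sign, biased_exponent, fraction). Hypothetical helper,
    shown only to illustrate the 1/11/52-bit layout.
    """
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)    # 52 bits, implicit leading 1 for normal numbers
    return sign, exponent, fraction


if __name__ == "__main__":
    for value in (1.0, -2.0, 0.5, 1.5):
        print(value, unpack_double(value))
```

An adder built on this format typically compares the two exponent fields, shifts the significand of the smaller operand to align it, then adds or subtracts and renormalizes; the abstract does not say how the proposed design arranges these steps.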


Efficient High-Level Synthesis for Nested Loops of Nonrectangular Iteration Spaces

Sim, Rahman, Lee. 2016. IEEE Trans. VLSI Syst.



Atul Rahman

Efficient FPGA Acceleration of Convolutional Neural Networks Using Logical-3D Compute Array

Design space exploration of FPGA accelerators for convolutional neural networks

New efficient hardware design methodology for modified non-restoring square root algorithm

Optimized hardware architecture for implementing IEEE 754 standard double precision floating point adder/subtractor

Efficient High-Level Synthesis for Nested Loops of Nonrectangular Iteration Spaces
