2019
DOI: 10.18359/rcin.4194

A Hardware Accelerator for the Inference of a Convolutional Neural Network

Abstract: Convolutional Neural Networks (CNNs) are becoming increasingly popular in deep learning applications, e.g., image classification, speech recognition, and medicine, to name a few. However, CNN inference is computationally intensive and demands a large amount of memory resources. In this work, a CNN inference hardware accelerator is proposed and implemented in a co-processing scheme. The aim is to reduce hardware resource usage while achieving the best possible throughput. The design was implemented i…

Cited by 8 publications (9 citation statements)
References 15 publications
“…This consumption is taken from Vivado's power report of the implemented design and consists of 94 mW static and 534 mW dynamic power. This is nearly 67 % lower than other LeNet CNN architectures, which are around 1800 mW [16] [17]. In the experimental setup, the images are loaded through the serial interface of the board and the result is shown on the LEDs of the board.…”
Section: Discussion
confidence: 89%
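The power claim in the statement above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the citation (94 mW static, 534 mW dynamic, and the ~1800 mW baseline cited for other LeNet accelerators); since the baseline is approximate, the computed reduction of about 65 % is consistent with the quoted "nearly 67 %".

```python
# Sanity check of the quoted power figures; the 1800 mW baseline is the
# approximate total reported for the compared LeNet designs [16] [17].
static_mw = 94
dynamic_mw = 534
baseline_mw = 1800

total_mw = static_mw + dynamic_mw
reduction_pct = 100 * (baseline_mw - total_mw) / baseline_mw
print(f"total = {total_mw} mW, reduction = {reduction_pct:.0f}%")
# → total = 628 mW, reduction = 65%
```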
“…In other words, the outputs of the Python and FPGA designs give exactly the same result at each stage of the CNN. Moreover, for a fair comparison, the proposed accelerator is compared with other LeNet CNN implementations in the literature that have the same number of convolutional and fully connected layers [17] [24] [25]. The design of [24] uses a Zynq Ultrascale FPGA, and HLS is used in the development stage.…”
Section: Discussion
confidence: 99%
“…Various CNN implementations on FPGAs have been reported in the literature [11]-[13], focusing on different aspects, e.g., the optimization of only the convolutional layers [14], [15] or the overall accelerator throughput [16]. There are also SW/HW co-design solutions that exploit the aggregate power of both an embedded processor and the programmable logic [17], [18]. Some [19], [20] present resource-intensive designs and achieve high throughput while disregarding power consumption, whereas others implement binary neural networks (BNNs) that achieve high power efficiency at the cost of reduced accuracy by using binary weights/biases [21].…”
Section: Background and Literature Review
confidence: 99%
“…AI applications include prediction, recommendation, classification and recognition, object detection, natural language processing, and autonomous systems, among others. The topics of the articles in this special issue include deep learning applied to medicine [1,3], support vector machines applied to ecosystems [2], human-robot interaction [4], clustering for the identification of anomalous patterns in communication networks [5], expert systems for the simulation of natural disaster scenarios [6], real-time artificial intelligence algorithms [7], and big data analytics for natural disasters [8].…”
confidence: 99%