2016 IEEE 34th International Conference on Computer Design (ICCD)
DOI: 10.1109/iccd.2016.7753296
CNN-MERP: An FPGA-based memory-efficient reconfigurable processor for forward and backward propagation of convolutional neural networks

Abstract: Large-scale deep convolutional neural networks (CNNs) are widely used in machine learning applications. While CNNs involve huge complexity, VLSI (ASIC and FPGA) chips that deliver high-density integration of computational resources are regarded as a promising platform for CNN implementation. At massive parallelism of computational units, however, the external memory bandwidth, which is constrained by the pin count of the VLSI chip, becomes the system bottleneck. Moreover, VLSI solutions are usually …

Cited by 26 publications
(8 citation statements)
References 17 publications
“…In addition to publications that focus on the acceleration of DNN inference, some publications tackle the problem of implementing backpropagation for neural network training on FPGAs as well. For example, [92] and [29] implement frameworks for CNN training on FPGAs, and [76] explores the training of LSTM layers on FPGAs. With approaches like these, it would be possible to implement FPGA-based DRL architectures with models including CNN and LSTM layers.…”
Section: Neural Network in FPGA-based DRL Implementations (mentioning)
confidence: 99%
“…There are also some notable ideas for accelerating CNNs. The authors of [15, 16] utilised the reconfigurability of the FPGA to create a runtime-configurable CNN accelerator. This saves considerable resources but spends too much time configuring the FPGA before computation.…”
Section: Related Work (mentioning)
confidence: 99%
“…Due to the high-performance, reconfigurable and energy-efficient nature of FPGAs, many FPGA-based accelerators [14][15][16][17][18] have been proposed that can implement CNNs; these have achieved high throughput and improved energy efficiency. Several novel reconfigurable architectures were proposed in [14] that improve the sum-of-products operations used in the convolutional kernels of CNNs.…”
Section: Related Work (mentioning)
confidence: 99%
“…In [15], a modified Caffe CNN framework is presented; this framework implements CNNs using FPGAs, transparently supporting FPGA implementations of individual CNN layers. In 2016, CNN-MERP, a CNN processor incorporating an efficient memory hierarchy, was produced by Han et al [16]; this processor was shown to have significantly lower bandwidth requirements. Bettoni et al [17] proposed an FPGA implementation of CNNs in low-power embedded systems; this study addressed portability and power efficiency.…”
Section: Related Work (mentioning)
confidence: 99%