Optimized Compression for Implementing Convolutional Neural Networks on FPGA

2019 · DOI: 10.3390/electronics8030295

Abstract: The field programmable gate array (FPGA) is widely considered a promising platform for convolutional neural network (CNN) acceleration. However, the large number of parameters in CNNs imposes heavy computing and memory burdens on FPGA-based CNN implementations. To solve this problem, this paper proposes an optimized compression strategy and realizes an FPGA-based accelerator for CNNs. Firstly, a reversed-pruning strategy is proposed which reduces the number of parameters of AlexNet by a factor of 13× without…
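As background for the pruning result quoted in the abstract, the following is a minimal sketch of generic magnitude-based weight pruning in NumPy. It is not the paper's reversed-pruning algorithm, whose layer ordering is not described in the excerpt above; the threshold rule and the 90% sparsity target are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly
    `sparsity` of the weights become zero.

    Generic illustration only: the paper's reversed-pruning strategy
    additionally chooses the order in which layers are pruned, which
    is not modeled here.
    """
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    flat = np.abs(weights).ravel()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

# Example: prune a mock fully connected layer to ~90% sparsity.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {np.count_nonzero(w_pruned) / w_pruned.size:.3f}")
```

Storing only the surviving nonzero weights (for example in a compressed sparse row layout) is what turns this sparsity into the on-chip memory savings that matter on an FPGA.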

Cited by 58 publications (23 citation statements) · References 21 publications

“…In order to obtain a more robust platform, layers templates will be improved by adding the capacity to infer quantized and pruned convolutional neural networks, reducing drastically the number of operations and memory required for benchmark CNNs (e.g., AlexNet), as explained in [27].…”
Section: Discussion (mentioning)
confidence: 99%
“…In FPGA technology, compression techniques are suitable to reduce redundant parameters and memory footprint, which has direct impact in the power consumption, speed and resource use [32][33][34][35]. Cheng et al [36] presented a review of the state of the art in compression techniques, summarizing the different approaches in: parameter pruning and quantization, low-rank factorization, transferred/compact convolutional filters and knowledge distillation.…”
Section: Perspectives for an FPGA Realization of the NLCN (mentioning)
confidence: 99%
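The citation above groups parameter pruning with quantization. As a hedged sketch of the quantization half, the code below performs symmetric per-tensor post-training quantization to int8; real deployments often use per-channel scales and calibration data, which are omitted here.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor post-training quantization to int8."""
    scale = max(float(np.max(np.abs(weights))) / 127.0, 1e-12)  # avoid /0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(scale=0.05, size=(64, 64)).astype(np.float32)
q, s = quantize_int8(w)
print(f"max reconstruction error: {np.max(np.abs(dequantize(q, s) - w)):.6f}")
```

Replacing 32-bit floats with 8-bit integers cuts weight storage by 4× and lets FPGA DSP blocks use narrower multipliers, which is the direct link to the resource and power savings the quote mentions.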
“…DiracDeltaNet [40] is based on ShuffleNet [49], but replaces convolutions with shift operations and uses PACT quantization to classify at 58.7 fps on an FPGA. The architecture described in [50] uses reverse-pruning and peak-pruning strategies to improve the compression factor in AlexNet [16] without sacrificing accuracy. The authors of [51] create a design flow to implement CNN inference in FPGA-based SoCs using high-level synthesis (HLS).…”
Section: Related Work (mentioning)
confidence: 99%
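The DiracDeltaNet citation name-drops PACT quantization. As a rough sketch of that idea (forward pass only; the bit-width and the clipping bound alpha, which PACT learns during training, are illustrative values here):

```python
import numpy as np

def pact_forward(x: np.ndarray, alpha: float, k: int = 4) -> np.ndarray:
    """PACT-style activation quantization, inference arithmetic only.

    Activations are clipped to [0, alpha] and uniformly quantized to
    2**k - 1 positive levels. In PACT, alpha is a learnable parameter
    trained by gradient descent; that training loop is omitted here.
    """
    clipped = np.clip(x, 0.0, alpha)
    scale = alpha / (2**k - 1)
    return np.round(clipped / scale) * scale

x = np.linspace(-1.0, 3.0, 9)
print(pact_forward(x, alpha=2.0, k=2))  # quantized to {0, 2/3, 4/3, 2}
```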