2020
DOI: 10.1080/03772063.2020.1821797
Efficient CNN Accelerator on FPGA

Cited by 23 publications
(7 citation statements)
References 19 publications
“…As shown in Figure 6, according to different design concepts and requirements, FPGA-based neural network optimization technology can be roughly divided into optimization for data and operation, optimization for bandwidth, and optimization for memory and access, among others, which are introduced in detail below. [71]–[78], fewer computations [79]–[81], improved calculation speed [82]–[85], the Winograd fast convolution algorithm [86]–[91], the Im2col convolution optimization algorithm [92]–[97], pipelined design [98]–[102], the Roofline model [103]–[105], ping-pong caching [106]–[109], input feature map reuse [110,111], filter reuse [111,112], and convolutional reuse [110]…”
Section: Neural Network Optimization Technology Based On FPGA (mentioning)
confidence: 99%
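Among the techniques this citing survey lists, the Im2col convolution optimization algorithm lowers a convolution to one large matrix multiplication so that a GEMM engine can execute it. A minimal pure-Python sketch of the idea (all function names are illustrative, not taken from the cited papers; real accelerators additionally tile these matrices to fit on-chip memory):

```python
# Im2col sketch: lower a 2-D convolution to one matrix multiplication.
# Illustrative only; hardware implementations tile and stream these matrices.

def im2col(image, k):
    """Unroll every k x k patch of `image` (a list of lists) into a row."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(k) for dj in range(k)])
    return rows

def conv2d_im2col(image, kernel):
    """Valid convolution (no padding) via im2col + per-row dot products."""
    k = len(kernel)
    flat_k = [kernel[di][dj] for di in range(k) for dj in range(k)]
    patches = im2col(image, k)
    out_w = len(image[0]) - k + 1
    flat = [sum(p * q for p, q in zip(row, flat_k)) for row in patches]
    # Re-fold the flat output vector into a 2-D feature map.
    return [flat[r * out_w:(r + 1) * out_w] for r in range(len(flat) // out_w)]
```

For example, a 3×3 input with a 2×2 kernel yields a 4×4 patch matrix multiplied by a length-4 kernel vector, i.e. `conv2d_im2col([[1,2,3],[4,5,6],[7,8,9]], [[1,0],[0,1]])` gives `[[6, 8], [12, 14]]`.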
“…In 2019, Asgar Abbaszadeh et al. [83] proposed a universal square-matrix computing unit based on a cyclic matrix structure and tested a 500 × 500 matrix on an FPGA running at 346 MHz, achieving a throughput of 173 GOPS. In 2020, S. Kala and S. Nalesh [84] proposed an efficient CNN accelerator based on a blocked Winograd GEMM (general matrix multiplication) architecture. Blocking was used to improve bandwidth and storage efficiency, and the ResNet-18 CNN model was implemented on an XC7VX690T FPGA.…”
Section: Winograd Fast Convolution Algorithm (mentioning)
confidence: 99%
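The Winograd approach referenced in [84] builds on minimal filtering algorithms such as F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. A small sketch using the standard F(2,3) transform matrices (the GEMM blocking of the cited accelerator is not reproduced here):

```python
# Winograd F(2,3): two outputs of a 1-D 3-tap convolution
# using 4 elementwise multiplications instead of 6.

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 outputs [y0, y1]."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Input transform B^T d
    t0, t1, t2, t3 = d0 - d2, d1 + d2, d2 - d1, d1 - d3
    # Filter transform G g (precomputable once per filter)
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Elementwise (Hadamard) products: the 4 multiplications
    m0, m1, m2, m3 = t0 * u0, t1 * u1, t2 * u2, t3 * u3
    # Output transform A^T m
    return [m0 + m1 + m2, m1 - m2 - m3]
```

The result matches direct convolution: y0 = d0·g0 + d1·g1 + d2·g2 and y1 = d1·g0 + d2·g1 + d3·g2, which is why a hardware design can trade multipliers for the cheap add/subtract transforms.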
“…CNNs are composed of multiple layers of operations, such as convolution, pooling, ReLU, local response normalization, fully connected computation, and softmax [22], where the convolution layers are the key layers of the CNNs. Convolution operations are inspired by biological processes [23], in that the connectivity pattern between neurons resembles the organization of the human visual cortex.…”
Section: Background 2.2.1 Convolution Operation (mentioning)
confidence: 99%
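The layer operations named in that excerpt can be sketched compactly; a minimal pure-Python illustration of sliding-window convolution, ReLU, and 2×2 max pooling (illustrative only, not the cited accelerator's implementation):

```python
# Minimal sketches of common CNN layer operations:
# sliding-window convolution, ReLU, and 2x2 max pooling.

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as is usual in CNNs)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]

def relu(fmap):
    """Elementwise max(0, x) over a 2-D feature map."""
    return [[max(0, v) for v in row] for row in fmap]

def maxpool2x2(fmap):
    """Non-overlapping 2x2 max pooling with stride 2."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]
```

Chaining them, `maxpool2x2(relu(conv2d(image, kernel)))`, mirrors the conv → activation → pooling structure the excerpt describes.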
“…[72] implemented hybrid convolution on FPGA and analysed which cases suit FFT convolution and which suit Winograd convolution. [35], [73], [74], [75] unified the realization of Winograd convolution with kernel matrix multiplication and maximized the reusability of the module. [76], [77] conducted a comprehensive design-space exploration of Winograd convolution implementations on FPGA.…”
Section: CPU (mentioning)
confidence: 99%