2019 29th International Conference on Field Programmable Logic and Applications (FPL)
DOI: 10.1109/fpl.2019.00030

A High-Performance CNN Processor Based on FPGA for MobileNets

Cited by 108 publications (34 citation statements)
References 9 publications
“…Work [47] improved the computational parallelism by separating the convolution computation from other data processing such as pooling and fully connected layers. Bai et al [24] and Wu et al [22] proposed specific CNN accelerators for implementing depth-wise separable convolution on FPGAs. However, these works did not use fast algorithms to reduce the computational cost of convolution operations.…”
Section: Related Work
confidence: 99%
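For context, the cost saving that depth-wise separable convolution offers (the operation the accelerators of Bai et al [24] and Wu et al [22] implement) can be sketched with a quick multiply-accumulate count. The layer shape below is an illustrative assumption, not taken from any of the cited papers:

```python
# Minimal sketch of the MAC-count saving of depth-wise separable
# convolution. The layer shape is hypothetical, chosen for illustration.

def standard_conv_macs(h, w, cin, cout, k):
    # One k x k filter per (input channel, output channel) pair.
    return h * w * cin * cout * k * k

def separable_conv_macs(h, w, cin, cout, k):
    depthwise = h * w * cin * k * k   # one k x k filter per input channel
    pointwise = h * w * cin * cout    # 1 x 1 convolution mixing channels
    return depthwise + pointwise

# Example layer: 56x56 feature map, 128 -> 128 channels, 3x3 kernel.
std = standard_conv_macs(56, 56, 128, 128, 3)
sep = separable_conv_macs(56, 56, 128, 128, 3)
print(f"standard: {std:,} MACs, separable: {sep:,} MACs, "
      f"reduction: {std / sep:.1f}x")  # about 8.4x for this shape
```

The general ratio is 1/cout + 1/k², so for 3x3 kernels the factorization cuts the multiply count by roughly 8-9x, which is why the remaining bottleneck shifts toward the fast-algorithm question the excerpt raises.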
“…In practical applications, a field-programmable gate array (FPGA) is a popular option for designing AUVs due to the convenience of expanding peripheral interfaces and customizing special hardware control logic. Compared with work focusing on balancing computation parallelism against memory bandwidth [16][17][18][19][20][21][22][23][24], research focusing on optimizing the implementation of convolution computation on FPGAs has attracted more attention recently. Converting convolution into general matrix multiplication (GEMM) can reduce the number of memory accesses [25].…”
Section: Introduction
confidence: 99%
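The convolution-to-GEMM lowering mentioned in the last sentence is commonly done via im2col: each k x k input patch becomes one matrix column, so the whole layer reduces to a single large matrix multiply. The following NumPy sketch illustrates the idea only; the shapes and function names are assumptions, not the implementation in [25]:

```python
import numpy as np

def im2col(x, k):
    """Unfold k x k patches of x (cin, h, w) into columns of a matrix."""
    cin, h, w = x.shape
    oh, ow = h - k + 1, w - k + 1
    cols = np.empty((cin * k * k, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[:, i:i + k, j:j + k].ravel()
    return cols

def conv_as_gemm(x, weights):
    """weights: (cout, cin, k, k) -> output (cout, oh, ow) via one GEMM."""
    cout, cin, k, _ = weights.shape
    cols = im2col(x, k)                # (cin*k*k, oh*ow)
    w_mat = weights.reshape(cout, -1)  # (cout, cin*k*k)
    out = w_mat @ cols                 # the single matrix multiply
    oh = x.shape[1] - k + 1
    return out.reshape(cout, oh, -1)

x = np.random.rand(3, 8, 8)
w = np.random.rand(4, 3, 3, 3)
print(conv_as_gemm(x, w).shape)  # (4, 6, 6)
```

The appeal on FPGAs is that the irregular sliding-window access pattern is traded for one dense, highly regular matrix multiply that maps well onto a systolic or DSP array.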
“…Meanwhile, many excellent algorithms have been designed to accelerate the basic functions of CNN inference. In [23], two dedicated computing engines, named Conv Engine and Dwcv Engine, were designed for pointwise convolution and depthwise convolution, respectively, to improve efficiency. In [15], the authors aimed to accelerate sparse CNNs.…”
Section: Related Work
confidence: 99%
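The two-engine split described for [23] can be mimicked in software to show the dataflow it separates: a channel-independent depth-wise stage followed by a channel-mixing 1x1 stage. This is an illustrative NumPy analogue under assumed shapes, not the paper's hardware design:

```python
import numpy as np

def dwcv_engine(x, dw_filters):
    """Depth-wise conv: each channel filtered independently (valid padding)."""
    cin, h, w = x.shape
    k = dw_filters.shape[-1]
    oh, ow = h - k + 1, w - k + 1
    out = np.zeros((cin, oh, ow))
    for c in range(cin):  # note: no cross-channel accumulation
        for i in range(oh):
            for j in range(ow):
                out[c, i, j] = np.sum(x[c, i:i+k, j:j+k] * dw_filters[c])
    return out

def conv_engine(x, pw_weights):
    """Point-wise (1x1) conv: a pure channel mix at every pixel."""
    return np.tensordot(pw_weights, x, axes=([1], [0]))  # (cout, h, w)

x = np.random.rand(16, 10, 10)
dw = np.random.rand(16, 3, 3)  # one 3x3 filter per channel
pw = np.random.rand(32, 16)    # 1x1 filters: (cout, cin)
print(conv_engine(dwcv_engine(x, dw), pw).shape)  # (32, 8, 8)
```

Because the two stages have such different arithmetic-to-bandwidth ratios, giving each its own engine lets the hardware provision multipliers and buffers for each pattern separately instead of forcing one generic datapath to serve both.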
“…They use a smaller FPGA, but we expect that if they scaled up to a larger FPGA, their DSP-to-logic utilization ratio would remain roughly the same and their accelerator would still be unable to take advantage of all of the available multipliers. Table IV shows a comparison of HPIPE to the V100 GPU running MobileNet-V1 and a comparison of HPIPE to the FPGA accelerator from Wu et al [27] running MobileNet-V2. NVIDIA does not report accuracy for their implementation of MobileNet-V1.…”
Section: B. Sparse CNN on FPGA
confidence: 99%