2017 International Conference on Field Programmable Technology (ICFPT) 2017
DOI: 10.1109/fpt.2017.8280160

PipeCNN: An OpenCL-based open-source FPGA accelerator for convolution neural networks

Abstract: Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high power dissipations. Recently, studies were carried out exploiting FPGA as CNN accelerator because of its reconfigurability and energy efficiency advantage over GPU, especially when OpenCL-based high-level synthesis tools are now available providing fast verification and implement…

Cited by 77 publications (74 citation statements)
References 7 publications
“…The developed hardware architectures consist of C++ host controllers and multiple OpenCL kernels, which are accelerated using either an FPGA or a GPU. For x86-based systems, OpenCL accelerated kernels using FPGAs typically reside on an FPGA development board, which is connected to a separate independent host system through the PCIe express interface [2]. For ARM-based systems, the FPGA is typically connected to a Hard Processor System (HPS) on a SoC through specialized bridges, as in the case of the Intel DE1-SoC development board that we used here.…”
Section: A Software Architecture
mentioning
confidence: 99%
“…Our work completes their analysis. The group of Don Wang [17] developed an FPGA framework for image classification and a comparison of the CNN models AlexNet and VGG-16 [7] on both Altera and Xilinx FPGAs. In this case, the shortest classification time is achieved by Altera DE5-net and is 23 FPS for AlexNet and 1.4 FPS for VGG-16, with a power consumption of 27.3 Watt and 29.8 Watt respectively.…”
Section: Related Work
mentioning
confidence: 99%
“…There are several OpenCL frameworks for DNN deployment on FPGAs. Among these available frameworks, we are using PipeCNN created by Wang et al [19], which is the only open-source one. Our next step is to deploy pretrained models to the FPGA using PipeCNN and investigate the real-time inference performance of the FPGA for different models and input image sizes.…”
Section: Future Work
mentioning
confidence: 99%
“…ResNet-32 with input size 32 by 32 has only 0.46 million parameters, while AlexNet and VGG, both of which are used in the acceleration of CNNs on FPGAs by researchers like Suda et al [18] and Wang et al [19], have 60 million and 138 million parameters respectively. III. METHODOLOGY Our experiments are completed on a Ubuntu 16.04 LTS machine with an Intel i7-7700K 4.2 GHz CPU and an NVIDIA GTX 1050 Ti GPU with 16 GB memory.…”
mentioning
confidence: 99%