2021 IEEE International Conference on Consumer Electronics (ICCE)
DOI: 10.1109/icce50685.2021.9427656

Filter-Wise Quantization of Deep Neural Networks for IoT Devices

Cited by 2 publications (2 citation statements)
References 3 publications
“…There have been several studies to reduce the complexity of neural networks: by pruning unimportant network connections; by quantizing and applying Huffman coding [2]; by applying vector quantization to CNNs [3]; by studying quantization, coding, pruning, and sharing techniques for image instance retrieval [4]; or by applying filter-wise quantization [5]. Some studies tried to optimize the network model architecture by applying complexity-optimized computation methods [6,7].…”
Section: Optimization and Acceleration for Embedded Edge Devices
Citation type: mentioning, confidence: 99%
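
Filter-wise quantization, as cited in [5] above, gives each convolution filter (output channel) its own quantization scale rather than sharing one scale across the whole layer, which preserves filters with small weight ranges. A minimal NumPy sketch of that generic per-filter scheme (the function names, 8-bit setting, and OIHW tensor layout are illustrative assumptions, not details taken from the cited paper):

```python
import numpy as np

def quantize_filter_wise(weights, n_bits=8):
    """Quantize a conv weight tensor with one scale per output filter.

    weights: float32 array, shape (out_ch, in_ch, kH, kW).
    Returns int8 weights plus one float scale per filter.
    """
    qmax = 2 ** (n_bits - 1) - 1                   # 127 for 8 bits
    # Per-filter scale: max |w| within each filter, mapped to qmax.
    scales = np.abs(weights).reshape(weights.shape[0], -1).max(axis=1) / qmax
    scales = np.where(scales == 0.0, 1.0, scales)  # guard all-zero filters
    q = np.round(weights / scales[:, None, None, None]).astype(np.int8)
    return q, scales

def dequantize_filter_wise(q, scales):
    """Recover approximate float weights from int8 codes and scales."""
    return q.astype(np.float32) * scales[:, None, None, None]
```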
“…There are two approaches to efficiently optimizing a DNN on edge devices: optimizing the network at the software algorithm level and using a hardware Deep Learning Accelerator (DLA); the two approaches can also be combined. For software optimization, lightweight techniques that reduce the complexity of the DNN have been studied, such as network layer compression, weight pruning, and filter quantization methods [2][3][4][5][6][7][8][9]. A hardware Deep Learning Accelerator has multiple pipelined Multiply-Accumulate (MAC) modules and a pipelined datapath and buffer structure for fetching and processing input/weight data from external memory to accelerate DNN operations [10][11][12][13][14][15][16][17].…”
Section: Introduction
Citation type: mentioning, confidence: 99%
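
The MAC modules described above perform the core integer work: int8 input-by-weight products accumulated in a wide (e.g. int32) register, with one rescale per output filter when filter-wise scales are used. A minimal NumPy sketch of the computation such a pipeline implements (the shapes, names, and valid-convolution layout are illustrative assumptions; a real DLA executes these loops in pipelined hardware):

```python
import numpy as np

def int_conv2d_filter_wise(x_q, x_scale, w_q, w_scales):
    """Valid 2-D convolution with int8 operands and int32 accumulation.

    x_q: int8 input, shape (in_ch, H, W); x_scale: input scale (float);
    w_q: int8 weights, shape (out_ch, in_ch, kH, kW);
    w_scales: one float scale per filter.
    """
    in_ch, H, W = x_q.shape
    out_ch, _, kH, kW = w_q.shape
    oh, ow = H - kH + 1, W - kW + 1
    out = np.empty((out_ch, oh, ow), dtype=np.float32)
    for o in range(out_ch):
        acc = np.zeros((oh, ow), dtype=np.int32)   # wide accumulator
        for c in range(in_ch):
            for i in range(kH):
                for j in range(kW):
                    # The MAC step the hardware pipelines:
                    # int8 * int8 -> int32, accumulated in place.
                    patch = x_q[c, i:i + oh, j:j + ow].astype(np.int32)
                    acc += patch * int(w_q[o, c, i, j])
        out[o] = acc * (x_scale * w_scales[o])     # one rescale per filter
    return out
```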