2021 IEEE International Conference on Consumer Electronics (ICCE)
DOI: 10.1109/icce50685.2021.9427656

Filter-Wise Quantization of Deep Neural Networks for IoT Devices

Cited by 2 publications (2 citation statements)
References 3 publications
“…There have been several studies to reduce the complexity of neural networks: by pruning unimportant network connections; by quantizing and applying Huffman coding [2]; by applying vector quantization to CNNs [3]; by studying quantization, coding, pruning, and sharing techniques for image instance retrieval [4]; or by applying filter-wise quantization [5]. Some studies tried to optimize the network model architecture by applying complexity-optimized computation methods [6,7].…”
Section: Optimization and Acceleration for Embedded Edge Devices
Citation type: mentioning, confidence: 99%
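
Filter-wise quantization, as cited in [5] above, gives each convolution filter (output channel) its own quantization scale rather than sharing one scale across the whole layer, which preserves filters with small weight ranges. A minimal NumPy sketch of that generic per-filter scheme (the function names, 8-bit setting, and OIHW tensor layout are illustrative assumptions, not details taken from the cited paper):

```python
import numpy as np

def quantize_filter_wise(weights, n_bits=8):
    """Quantize a conv weight tensor with one scale per output filter.

    weights: float32 array, shape (out_ch, in_ch, kH, kW).
    Returns int8 weights plus one float scale per filter.
    """
    qmax = 2 ** (n_bits - 1) - 1                   # 127 for 8 bits
    # Per-filter scale: max |w| within each filter, mapped to qmax.
    scales = np.abs(weights).reshape(weights.shape[0], -1).max(axis=1) / qmax
    scales = np.where(scales == 0.0, 1.0, scales)  # guard all-zero filters
    q = np.round(weights / scales[:, None, None, None]).astype(np.int8)
    return q, scales

def dequantize_filter_wise(q, scales):
    """Recover approximate float weights from int8 codes and scales."""
    return q.astype(np.float32) * scales[:, None, None, None]
```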
“…There are two approaches to efficiently optimizing a DNN on edge devices: optimizing the network at the software algorithm level and using a hardware Deep Learning Accelerator (DLA); the two approaches can also be combined. For software optimization, lightweight techniques that reduce the complexity of the DNN have been studied, such as network layer compression, weight pruning, and filter quantization methods [2][3][4][5][6][7][8][9]. A hardware Deep Learning Accelerator has multiple pipelined Multiply-Accumulate (MAC) modules and a pipelined datapath and buffer structure for fetching and processing input/weight data from external memory to accelerate DNN operations [10][11][12][13][14][15][16][17].…”
Section: Introduction
Citation type: mentioning, confidence: 99%
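
The MAC modules described above perform the core integer work: int8 input-by-weight products accumulated in a wide (e.g. int32) register, with one rescale per output filter when filter-wise scales are used. A minimal NumPy sketch of the computation such a pipeline implements (the shapes, names, and valid-convolution layout are illustrative assumptions; a real DLA executes these loops in pipelined hardware):

```python
import numpy as np

def int_conv2d_filter_wise(x_q, x_scale, w_q, w_scales):
    """Valid 2-D convolution with int8 operands and int32 accumulation.

    x_q: int8 input, shape (in_ch, H, W); x_scale: input scale (float);
    w_q: int8 weights, shape (out_ch, in_ch, kH, kW);
    w_scales: one float scale per filter.
    """
    in_ch, H, W = x_q.shape
    out_ch, _, kH, kW = w_q.shape
    oh, ow = H - kH + 1, W - kW + 1
    out = np.empty((out_ch, oh, ow), dtype=np.float32)
    for o in range(out_ch):
        acc = np.zeros((oh, ow), dtype=np.int32)   # wide accumulator
        for c in range(in_ch):
            for i in range(kH):
                for j in range(kW):
                    # The MAC step the hardware pipelines:
                    # int8 * int8 -> int32, accumulated in place.
                    patch = x_q[c, i:i + oh, j:j + ow].astype(np.int32)
                    acc += patch * int(w_q[o, c, i, j])
        out[o] = acc * (x_scale * w_scales[o])     # one rescale per filter
    return out
```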