2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00204
Towards Unified INT8 Training for Convolutional Neural Network

Cited by 124 publications (79 citation statements). References 25 publications.
“…However, SOTA neural network models rely on massive numbers of parameters and large model sizes to achieve good performance on different tasks, which results in complex computation and heavy resource consumption. To compress and accelerate deep CNNs, many approaches have been proposed; they can be classified into five categories: transferred/compact convolutional filters [89,85,78]; quantization/binarization [35,11,82,92]; knowledge distillation [12,86,16]; pruning [28,31,22]; low-rank factorization [46,38,47,79].…”
Section: Related Work
confidence: 99%
“…It yields more efficient inference since all the quantization parameters are known in advance and fixed. Several of the most common range estimators include: current min-max or simply min-max, uses the full dynamic range of the tensor (Zhou et al, 2016;Wu et al, 2018b;Zhu et al, 2020);…”
Section: Background and Related Work
confidence: 99%
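The min-max range estimator mentioned above can be sketched as follows. This is an illustrative sketch, not code from any of the cited papers; the function names and the symmetric signed-integer grid are assumptions. The quantization range is simply the tensor's full dynamic range [min(x), max(x)]:

```python
import numpy as np

def minmax_quantize(x, num_bits=8):
    """Quantize a tensor with the current min-max range estimator:
    the quantization range is the tensor's full dynamic range.
    (Hypothetical helper for illustration.)"""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Scale maps the observed dynamic range onto the integer grid.
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = qmin - round(x_min / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the integers back to real values."""
    return (q.astype(np.float32) - zero_point) * scale
```

Because the range is taken from the current tensor rather than learned or clipped, a single outlier can inflate the scale; that sensitivity is why the literature also studies alternative range estimators.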
“…Due to the resource-constrained environment of the edge devices, a high-performance on-device learning system needs to reduce the resource cost as much as possible, including the hardware-level computation overhead, communication-level network traffic and energy-level consumption. Overall, resource saving is one of the most essential demands to deploy on-device learning [49]. Personalized Model.…”
Section: B. On-device Learning
confidence: 99%
“…Therefore, the gradients of activation and weights of current layer are also calculated as INT8. After obtaining the full gradients of this layer, we need to dequantize them into FP32 format and update the model parameters in full precision for better optimization accuracy [49]. Therefore, a full-precision copy of the original weights and activations is needed for conducting the model updating in the FP32 data type.…”
Section: B. Low-precision Data Representation
confidence: 99%
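The update path described in this excerpt can be sketched as below. This is a minimal sketch under stated assumptions (helper names are hypothetical, and a simple symmetric per-tensor scale stands in for whatever scheme the cited work uses): the gradient is quantized to INT8, then dequantized to FP32 and applied to a full-precision master copy of the weights rather than to INT8 weights directly:

```python
import numpy as np

def int8_grad_update(master_w, grad_fp32, lr=0.01, num_bits=8):
    """Quantize the gradient to INT8, dequantize it back to FP32,
    and update the FP32 master copy of the weights.
    (Illustrative sketch, not the paper's exact implementation.)"""
    qmax = 2 ** (num_bits - 1) - 1
    # Symmetric per-tensor scale from the gradient's max magnitude.
    scale = max(float(np.abs(grad_fp32).max()), 1e-8) / qmax
    g_int8 = np.clip(np.round(grad_fp32 / scale), -qmax - 1, qmax).astype(np.int8)
    # Dequantize to FP32 before the optimizer step, as described above.
    g_fp32 = g_int8.astype(np.float32) * scale
    # The update is applied to the full-precision master weights.
    return master_w - lr * g_fp32
```

Keeping the master copy in FP32 matters because an INT8 weight grid is too coarse to accumulate small learning-rate-scaled updates; the quantized values are only a transport/compute format.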