Artifact Digital Object Group 2018
DOI: 10.1145/3229769
Collective Knowledge Workflow for Highly Efficient 8-bit Low Precision Inference of Convolutional Neural Networks with IntelCaffe

Cited by 2 publications (2 citation statements). References 0 publications.
“…The influence of parameters quantization on the NN model performance is an active area of research, where the NN parameters have been quantized with 16-bits [20,32], 8-bits [1,9], 4-bits [3] or 2-bits [4]. Moreover, ternary [34] and binary (1-bit) quantization [10,22] have also been taken into account.…”
Section: Introduction
confidence: 99%
“…The main advantage of USQ is the design simplicity accompanied with relatively good performance when compared to more complex non-uniform quantization. Nevertheless, a detailed design process of the quantizer, taking into account the assumed statistical distribution of NN parameters, is missing in above mentioned papers [1,4,9,10,20,32,34] about quantization of NN parameters. In this paper we design USQ for compression of NN weights assuming Laplacian distribution of weights and bit rates from 9 to 16 bps.…”
Section: Introduction
confidence: 99%
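The citation statements above contrast uniform scalar quantization (USQ) with distribution-aware designs. A minimal sketch of symmetric USQ applied to a weight array is shown below; the function name `uniform_quantize` and the max-absolute-value scaling rule are illustrative assumptions, not the design from the cited papers (which derive the quantizer from an assumed Laplacian weight distribution).

```python
import numpy as np

def uniform_quantize(w, bits=8):
    """Symmetric uniform scalar quantization of a weight array to `bits` bits.

    Sketch only: the step size is taken from the max absolute weight, whereas
    a distribution-aware design (e.g. for Laplacian weights, as in the citing
    paper) would choose it from the assumed statistics instead.
    """
    levels = 2 ** (bits - 1) - 1           # e.g. 127 integer levels for 8-bit signed
    scale = np.max(np.abs(w)) / levels     # quantization step size
    q = np.clip(np.round(w / scale), -levels, levels).astype(np.int32)
    return q, scale                        # dequantize as q * scale

# Usage: quantize synthetic Laplacian "weights" and reconstruct them.
rng = np.random.default_rng(0)
w = rng.laplace(scale=0.1, size=1000)
q, s = uniform_quantize(w, bits=8)
w_hat = q * s                              # reconstruction error is at most s / 2
```

Lowering `bits` (4-bit, 2-bit, ternary, binary, as surveyed in the quoted passage) shrinks storage but widens the step size and hence the reconstruction error.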