2022
DOI: 10.48550/arxiv.2203.05025
Preprint

Power-of-Two Quantization for Low Bitwidth and Hardware Compliant Neural Networks

Abstract: Deploying Deep Neural Networks in low-power embedded devices for real-time constrained applications requires optimization of the memory and computational complexity of the networks, usually by quantizing the weights. Most existing works employ linear quantization, which causes considerable degradation in accuracy for weight bit widths lower than 8. Since the distribution of weights is usually non-uniform (with most weights concentrated around zero), other methods, such as logarithmic quantization, are more s…

Cited by 4 publications (11 citation statements) · References 5 publications
“…The PoT quantization is a logarithmic quantizer [87] designed to approximate the weights to the closest power of two in the range defined by the considered number of bits. Mathematically, we can represent the PoT quantization considering $2^{BW}$ elements as [84], [87], [88]:…”
Section: Quantization
Mentioning, confidence: 99%
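As an illustration of the scheme described in this quote, here is a minimal NumPy sketch of nearest-power-of-two weight quantization. The exponent range (one sign bit plus bw - 1 exponent bits, giving $2^{BW}$ signed levels) and the absence of an explicit zero level are simplifying assumptions, not the exact formulation of the cited papers.

```python
import numpy as np

def pot_quantize(w: np.ndarray, bw: int = 4) -> np.ndarray:
    """Snap each weight to the nearest power of two (PoT quantization).

    Assumed layout: 1 sign bit and bw - 1 bits indexing the exponents
    {0, -1, ..., -(2**(bw - 1) - 1)}, i.e. 2**bw signed levels in total.
    """
    n_exp = 2 ** (bw - 1)  # number of representable exponents
    sign = np.sign(w)
    # Round log2(|w|) to the nearest integer exponent and clip it to the
    # range covered by the available exponent bits.
    exp = np.round(np.log2(np.abs(w) + np.finfo(w.dtype).tiny))
    exp = np.clip(exp, -(n_exp - 1), 0)
    return sign * np.exp2(exp)

w = np.array([0.7, -0.2, 0.05, -0.9], dtype=np.float32)
print(pot_quantize(w))  # [ 0.5   -0.25   0.0625 -1.   ]
```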
“…The latter is done by assuming that $\frac{\partial W_{qc}}{\partial W} = 1$. As stated in [88], this process is known as a Straight-Through Estimator, and it results in a smoother transition between consecutive quantization levels in the learning process.…”
Section: Quantization
Mentioning, confidence: 99%
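The Straight-Through Estimator in this quote can be sketched in PyTorch as a custom autograd Function: the forward pass applies the non-differentiable rounding, while the backward pass assumes the quantizer has derivative 1 and passes the gradient through unchanged. This is a generic STE sketch, not the exact training code of the cited work.

```python
import torch

class PoTQuantSTE(torch.autograd.Function):
    """Power-of-two quantizer trained with a Straight-Through Estimator."""

    @staticmethod
    def forward(ctx, w):
        # Non-differentiable step: snap each weight to the nearest power of two.
        sign = torch.sign(w)
        exp = torch.round(torch.log2(w.abs().clamp_min(1e-12)))
        return sign * torch.exp2(exp)

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat the quantizer as the identity, i.e. dW_qc/dW = 1,
        # so the incoming gradient flows through unchanged.
        return grad_output

w = torch.randn(8, requires_grad=True)
w_q = PoTQuantSTE.apply(w)   # quantized weights used in the forward pass
w_q.sum().backward()         # gradients reach w as if no rounding occurred
print(w.grad)                # all ones
```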
“…In the case of Power-of-Two (PoT) quantization, we have X = 0, because each multiplication costs just a shift [42], [53]. Lastly, for the Additive Powers-of-Two (APoT) quantization, we have X = n, where n denotes the number of additive terms.…”
Section: A Dense Layer
Mentioning, confidence: 99%
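The "each multiplication costs just a shift" point can be made concrete with fixed-point integers: multiplying an activation by a PoT weight $2^e$ reduces to a single bit shift, so no multiplier circuit (and, in the quoted cost model, X = 0 extra adders) is needed. A toy sketch:

```python
def mul_by_pot(x_fixed: int, e: int) -> int:
    """Multiply a fixed-point activation by a PoT weight 2**e via a shift.

    Fractional PoT levels (e < 0) become right shifts; e >= 0 becomes a
    left shift. No multiplier circuit is required either way.
    """
    return x_fixed << e if e >= 0 else x_fixed >> -e

# Activation 52 times weight 2**-2 = 0.25 -> 13, with a single shift.
assert mul_by_pot(52, -2) == 13
```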
“…Since APoT is a quantization scheme represented by a sum of PoT terms, APoT provides a smooth transition between PoT and uniform quantization. In various works, PoT was claimed to have very low complexity because the multiplications are replaced by just shifts [53], [69], [70]. However, when we consider that the multiplication in the uniform quantization can be represented by shifts and adders, and we have a fair metric like NABS to compare between different quantization techniques, the NABS when applying PoT is only around an order of magnitude lower than the NABS when using the uniform quantization.…”
Section: Comparative Analysis of the Complexities for Each NN Structure
Mentioning, confidence: 99%
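To see why APoT interpolates between PoT and uniform quantization, one can enumerate the level sets: each APoT level is a sum of n PoT terms, so a multiplication decomposes into n shifts plus additions (the X = n of the cost model quoted earlier), and larger n yields a denser, more uniform grid. The enumeration below is a simplification with an arbitrarily chosen exponent set; published APoT schemes constrain which exponents each term may draw from.

```python
from itertools import product

def apot_levels(n_terms: int = 2, exps=(0, -1, -2, -3)):
    """Enumerate unsigned APoT levels as sums of n_terms powers of two.

    With n_terms = 1 this degenerates to plain PoT; as n_terms grows,
    the level set fills in and approaches a uniform grid.
    """
    return sorted({sum(2.0 ** e for e in combo)
                   for combo in product(exps, repeat=n_terms)})

print(apot_levels(1))  # PoT:  [0.125, 0.25, 0.5, 1.0]
print(apot_levels(2))  # APoT: denser set, e.g. 0.375 = 2**-2 + 2**-3
```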