2018
DOI: 10.48550/arxiv.1810.01875
Preprint

Relaxed Quantization for Discretized Neural Networks

Abstract: Neural network quantization has become an important research area due to its great impact on deployment of large models on resource constrained devices. In order to train networks that can be effectively discretized without loss of performance, we introduce a differentiable quantization procedure. Differentiability can be achieved by transforming continuous distributions over the weights and activations of the network to categorical distributions over the quantization grid. These are subsequently relaxed to co…
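
The abstract describes making quantization differentiable by placing a categorical distribution over grid points and relaxing it. A minimal PyTorch sketch of that idea follows; the grid, the temperature, and the Gumbel-softmax relaxation are assumptions drawn from the abstract, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def relaxed_quantize(w, grid, temperature=0.5):
    """Differentiably map continuous weights onto a quantization grid.

    Sketch only: logits are taken as negative squared distances to the grid
    points and relaxed with a Gumbel-softmax so gradients flow through the
    discretization step.
    """
    # Negative squared distance of each weight to each grid point acts as a logit.
    logits = -(w.unsqueeze(-1) - grid) ** 2 / temperature
    # Relaxed (differentiable) one-hot assignment over grid points.
    probs = F.gumbel_softmax(logits, tau=temperature, hard=False)
    # Soft-quantized value: expectation of the grid under the relaxed categorical.
    return (probs * grid).sum(dim=-1)

# Example: soft-quantize weights onto a 16-level (4-bit) symmetric grid.
grid = torch.linspace(-1.0, 1.0, steps=16)
w = torch.randn(5, requires_grad=True)
w_q = relaxed_quantize(w, grid)
w_q.sum().backward()  # gradients reach the continuous weights
```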

Cited by 15 publications (28 citation statements)
References 20 publications
“…The proposed FAT is built on the PyTorch framework. We compare FAT with state-of-the-art approaches, including WAGE [40], LQ-Net [43], PACT [7], RQ [20], UNIQ [3], DQ [35], BCGD [2] [35], DSQ [10], QIL [13], HAQ [36], APoT [17], HMQ [11], DJPQ [38], LSQ [8].…”
Section: Methods (citation type: mentioning, confidence: 99%)
See 1 more Smart Citation
“…The proposed FAT is built on Pytorch framework. We compare FAT with state-of-the-art approaches, including WAGE [40], LQ-Net [43], PACT [7], RQ [20], UNIQ [3], DQ [35], BCGD [2] [35], DSQ [10], QIL [13], HAQ [36], APoT [17], HMQ [11] DJPQ [38], LSQ [8].…”
Section: Methodsmentioning
confidence: 99%
“…As shown in Table 2, we compare all methods that appeared in the main paper, including WAGE [40], LQ-Net [43], PACT [7], RQ [20], UNIQ [3], DQ [35], BCGD [2] [35], DSQ [10], QIL [13], HAQ [36], APoT [17], HMQ [11], DJPQ [38], LSQ [8].…”
Section: Categorization of Quantization Methods (citation type: mentioning, confidence: 99%)
“…In this section, we compare our GMPQ with the state-of-the-art fixed-precision models containing APoT [25] and RQ [31] and mixed-precision networks including ALQ [38], HAWQ [9], EdMIPS [3], HAQ [50], BP-NAS [56], HMQ [13] and DQ [47] on ImageNet for image classification and on PASCAL VOC for object detection. We also provide the performance of full-precision models for reference.…”
Section: Comparison with State-of-the-Art Methods (citation type: mentioning, confidence: 99%)
“…For both of the MNIST models, we found that letting each subcomponent of F be a simple dimensionwise scalar affine transform (similar to f dense in figure 3) was sufficient. Since each φ is quantized to integers, having a flexible scale and shift leads to flexible SQ, similar to (Louizos, Reisser, et al., 2018). Due to the small size of the networks, more complex transformation functions lead to too much overhead.…”
Section: MNIST Experiments (citation type: mentioning, confidence: 99%)
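
The passage above mentions a learnable per-dimension scale and shift applied before integer quantization. The class below is a hypothetical illustration of that pattern; the name AffineIntegerQuantizer and the straight-through estimator are assumptions, not the cited paper's implementation.

```python
import torch

class AffineIntegerQuantizer(torch.nn.Module):
    """Hypothetical sketch: dimensionwise scale-and-shift followed by
    integer rounding, in the spirit of the quoted passage."""

    def __init__(self, dim):
        super().__init__()
        self.scale = torch.nn.Parameter(torch.ones(dim))
        self.shift = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, phi):
        z = phi * self.scale + self.shift  # learnable affine transform
        z_int = torch.round(z)             # quantize to integers
        # Straight-through estimator: forward uses the rounded value,
        # backward treats rounding as the identity.
        return z + (z_int - z).detach()
```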
“…While these technically have a finite (but large) number of states, the best results in terms of both accuracy and bit rate are typically achieved for a significantly reduced number of states. Existing approaches to model compression often acknowledge this by quantizing each individual linear filter coefficient in an ANN to a small number of pre-determined values (Louizos, Reisser, et al., 2018; Baskin et al., 2018; F. Li et al., 2016).…”
Section: Introduction (citation type: mentioning, confidence: 99%)