2020
DOI: 10.1007/978-3-030-58583-9_6
DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks

Cited by 8 publications (9 citation statements)
References 15 publications
“…Quantization. Reducing the complexity of CNNs via model quantization in the absence of any adversary is a well-studied problem in the deep learning literature [32,33,2,46,15,27,3]. The role of quantization in adversarial robustness was studied in Defensive Quantization (DQ) [20], where it was observed that conventional post-training fixed-point quantization makes networks more vulnerable to adversarial perturbations than their full-precision counterparts.…”
Section: Background and Related Work
confidence: 99%
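The "conventional post-training fixed-point quantization" referred to in this excerpt can be illustrated with a minimal sketch, assuming a simple uniform affine scheme; the function name, the 8-bit setting, and the random weights below are illustrative placeholders, not DBQ's differentiable branch quantizer nor the exact scheme studied in [20]:

```python
import torch

def quantize_tensor_fixed_point(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Uniform (affine) post-training quantization of a single tensor.

    Maps float values to num_bits-wide integers using a scale and zero-point
    derived from the tensor's observed min/max range, then dequantizes back
    to float to expose the precision loss.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min) / float(qmax - qmin)
    zero_point = int((qmin - torch.round(x_min / scale)).clamp(qmin, qmax).item())
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

# Example: quantize a conv layer's weights after training, with no fine-tuning.
w = torch.randn(64, 3, 3, 3)
w_q = quantize_tensor_fixed_point(w, num_bits=8)
print((w - w_q).abs().max())  # worst-case per-element quantization error
```

The relevant point for the DQ observation quoted above is that the mapping is derived purely from the tensor's observed range after training, with no robustness consideration.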
“…We measure the throughput in FPS by mapping the networks onto an NVIDIA Jetson Xavier via native PyTorch [25] commands. We experiment with VGG-16 [38], ResNet-18 [12], ResNet-50, and WideResNet-28-4 [45] network architectures, and report both natural accuracy (A_nat) and robust accuracy (A_rob). Following standard procedure, we report A_rob against ℓ∞-bounded perturbations generated via PGD [22] with standard attack strengths: ε = 8/255 with PGD-100 for both the CIFAR-10 [18] and SVHN [24] datasets, and ε = 4/255 with PGD-50 for the ImageNet [31] dataset.…”
Section: Evaluation Setup
confidence: 99%
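As a rough sketch of the quoted evaluation protocol (ℓ∞ PGD at ε = 8/255 with 100 steps), the following could be used to compute robust accuracy A_rob; the step size α, the model, and the data loader are placeholder assumptions rather than the authors' exact evaluation code, and inputs are assumed to lie in [0, 1]:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=100):
    """l-infinity PGD: gradient-ascent steps projected back into the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project into the eps-ball around x and back into the valid image range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def robust_accuracy(model, loader, device, **attack_kwargs):
    """Fraction of examples still classified correctly after the attack."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y, **attack_kwargs)
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total

# Usage (placeholder model/loader): CIFAR-10-style setting from the excerpt.
# a_rob = robust_accuracy(model, test_loader, "cuda", eps=8 / 255, steps=100)
```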
“…Therefore, many recent works focus on building resource-efficient deep neural networks to bridge the gap between the scale of deep neural networks and the actual permissible computational complexity/memory bounds for on-device model deployment. Some of these works consider designing computation- and memory-efficient modules for neural architectures, while others focus on compressing a given neural network by either pruning its weights [7,12,19,36] or reducing the number of bits used to represent the weights and activations [3,8,18]. The last approach, neural network quantization, is beneficial for building on-device AI systems, since edge devices oftentimes only support low bit-width parameters and/or operations.…”
Section: Introduction
confidence: 99%
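Of the two compression routes mentioned in this excerpt, weight pruning admits a particularly compact sketch; the toy model, the 50% sparsity target, and the use of PyTorch's built-in torch.nn.utils.prune utilities below are illustrative assumptions, not the specific methods of [7,12,19,36]:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for an edge-deployed network (illustrative only).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)

# Global unstructured magnitude pruning: zero out the 50% of conv weights
# with the smallest absolute value, pooled across both layers.
parameters_to_prune = [
    (m, "weight") for m in model.modules() if isinstance(m, nn.Conv2d)
]
prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)

# Report the resulting sparsity per layer.
for module, _ in parameters_to_prune:
    sparsity = (module.weight == 0).float().mean().item()
    print(f"{module}: {sparsity:.0%} of weights pruned")
```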
“…As shown in Figure 1 (right), although BRECQ [18] addresses the problem by considering the dependency between filters in each block, it is limited to the Post-Training Quantization (PTQ) setting, which suffers from inevitable information loss, resulting in inferior performance. The most recent Quantization-Aware Training (QAT) methods [8,21] are concerned with obtaining quantized weights by minimizing quantization losses with parameterized activation functions, disregarding cross-layer weight dependencies during the training process. To the best of our knowledge, no prior work explicitly considers dependencies among the weights for QAT.…”
Section: Introduction
confidence: 99%
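A generic Quantization-Aware Training setup of the kind this excerpt contrasts with PTQ can be sketched as "fake quantization" with a straight-through estimator; note how each layer quantizes its own weights in isolation, which is exactly the per-layer view (ignoring cross-layer weight dependencies) that the quoted work criticizes. The class names and the 4-bit setting are assumptions for illustration, not the methods of [8] or [21]:

```python
import torch

class FakeQuantSTE(torch.autograd.Function):
    """Fake-quantize weights in the forward pass; pass gradients straight through."""

    @staticmethod
    def forward(ctx, w, num_bits=4):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max() / qmax                      # per-tensor scale
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat the rounding as identity.
        return grad_output, None

class QuantLinear(torch.nn.Linear):
    def forward(self, x):
        w_q = FakeQuantSTE.apply(self.weight, 4)          # quantize this layer only
        return torch.nn.functional.linear(x, w_q, self.bias)

# Training proceeds as usual: gradients flow to the latent float weights.
layer = QuantLinear(16, 8)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
loss = layer(torch.randn(4, 16)).pow(2).mean()
loss.backward()
opt.step()
```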
“…The second is to reduce the size of neural networks so that their inference latencies are low enough to handle real-time inputs [3,4,5,6,7,8]. There are numerous methods to reduce the size of neural networks for different platforms, among which are CPUs, GPUs, and FPGAs.…”
Section: Introduction
confidence: 99%