2020
DOI: 10.48550/arxiv.2010.05625
Preprint

Post-Training BatchNorm Recalibration

Gil Shomron,
Uri Weiser

Abstract: We revisit non-blocking simultaneous multithreading (NB-SMT), introduced previously by Shomron and Weiser [18]. NB-SMT trades accuracy for performance by occasionally "squeezing" more than one thread into a shared multiply-and-accumulate (MAC) unit. However, accommodating more than one thread in a shared MAC unit may add noise to the computations, thereby changing the internal statistics of the model. We show that substantial model performance can be recouped by post-training recalibration o…
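
The recalibration the abstract (and title) refers to can be sketched concisely. Below is a minimal PyTorch sketch of post-training BatchNorm recalibration, not the authors' released code: the weights stay frozen and only the BatchNorm running mean/variance buffers are recomputed over a small calibration set run through the (perturbed) model. The helper name recalibrate_batchnorm and the calibration loader are assumptions for illustration.

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def recalibrate_batchnorm(model: nn.Module, calib_loader, device="cpu"):
        # Reset accumulated statistics and switch every BatchNorm layer to
        # cumulative averaging so each calibration batch counts equally.
        for m in model.modules():
            if isinstance(m, nn.modules.batchnorm._BatchNorm):
                m.reset_running_stats()
                m.momentum = None  # None => cumulative moving average
        # Running stats are only updated in train mode; the weights are not
        # touched because gradients are disabled and no optimizer step runs.
        was_training = model.training
        model.train()
        for images, _ in calib_loader:
            model(images.to(device))
        model.train(was_training)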

Cited by 2 publications (2 citation statements) | References 18 publications

“…[42] recomputes population variance to compensate for the "variance shift" caused by the inference mode of dropout. [64,31,69] recompute / recalibrate population statistics to compensate for the distribution shift caused by test-time quantization. Recomputing population statistics on a different domain [43] is common in domain adaptation, which will be discussed further in Sec.…”
Section: Discussion and Related Work
confidence: 99%
“…The min-max statistics are gathered during a quick preprocessing stage on 2K randomly picked images from the training set. In addition, during preprocessing, we recalibrate the BatchNorm layers' running mean and running variance statistics [25,29,31,32]. In all models, the first convolution layer is left intact, since its input activations, which correspond to the image pixels, do not include many zero values, if any.…”
Section: Methods
confidence: 99%
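
The preprocessing stage quoted above combines two quick calibration passes: gathering per-layer min-max activation ranges and refreshing the BatchNorm running statistics. A minimal PyTorch sketch of the min-max pass is given below; the helper name collect_minmax, the choice of hooked layers (Conv2d/Linear), and the calibration loader (e.g. built from the roughly 2K sampled training images mentioned in the quote) are illustrative assumptions, not the cited paper's code.

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def collect_minmax(model: nn.Module, calib_loader, device="cpu"):
        # Per-layer (min, max) activation ranges, gathered via forward hooks.
        stats = {}

        def make_hook(name):
            def hook(_module, _inputs, output):
                lo, hi = output.min().item(), output.max().item()
                old = stats.get(name, (lo, hi))
                stats[name] = (min(old[0], lo), max(old[1], hi))
            return hook

        handles = [m.register_forward_hook(make_hook(n))
                   for n, m in model.named_modules()
                   if isinstance(m, (nn.Conv2d, nn.Linear))]
        model.eval()
        for images, _ in calib_loader:
            model(images.to(device))
        for h in handles:
            h.remove()
        return stats

Combined with the BatchNorm recalibration sketched after the abstract, these ranges would then parametrize the min-max quantizers used at inference time.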