2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2018.8461456
True Gradient-Based Training of Deep Binary Activated Neural Networks Via Continuous Binarization

Cited by 24 publications (15 citation statements) · References 4 publications
“…On the other hand, because of the nondifferentiable quantizer, some literature focuses on relaxing the discrete optimization problem. A typical approach is to train with regularization [13,49,2,1,33,8], where the optimization problem becomes continuous while gradually adjusting the data distribution towards quantized values. Apart from the two challenges, with the popularization of neural architecture search (NAS), Wang et al. [38] further propose to employ reinforcement learning to automatically determine the bit-width of each layer without human heuristics.…”
Section: Related Work
confidence: 99%
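As a rough illustration of the regularization idea described in the statement above (not the exact formulation of any of the cited works [13,49,2,1,33,8]), the sketch below adds a penalty that vanishes only when weights sit at the binary values ±1, so minimizing it alongside the task loss keeps the problem continuous while gradually pulling the weight distribution toward quantized values. The regularizer shape, the stand-in task loss, and the weighting factor `lambda_q` are assumptions for illustration.

```python
import torch

def binary_regularizer(weights):
    """W-shaped penalty: zero exactly when every weight is +1 or -1.

    Illustrative choice; the cited works use various regularizer shapes
    and annealing schedules for the penalty weight.
    """
    return ((weights.abs() - 1.0) ** 2).sum()

# Combine with a (stand-in) task loss; in practice lambda_q would be
# ramped up over training so weights drift toward {-1, +1} gradually.
w = torch.randn(256, 128, requires_grad=True)
task_loss = (w @ torch.randn(128)).pow(2).mean()   # placeholder for the real loss
lambda_q = 0.1                                     # hypothetical schedule value
loss = task_loss + lambda_q * binary_regularizer(w)
loss.backward()                                    # gradients remain well-defined
```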
“…In particular, [9,15,28] set the foundations for 1-bit quantization, while [16,50] did so for arbitrary bit-width quantization. Progressive quantization [2,1,53,33], loss-aware quantization [13,49], improved gradient estimators for non-differentiable functions [21], and RL-aided training [20] have focused on improved training schemes, while mixed-precision quantization [36], hardware-aware quantization [37], and architecture search for quantized models [34] have focused on alternatives to standard quantized models. However, these strategies are exclusively focused on improving the performance and efficiency of static networks.…”
Section: Introduction
confidence: 99%
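Purely as an illustration of the "progressive quantization" idea name-checked above (the exact schedules in [2,1,53,33] differ), the sketch below applies a uniform quantizer whose bit-width is lowered stage by stage during training, so the network adapts to increasingly coarse weights. Both the quantizer and the epoch-based schedule are assumptions.

```python
import torch

def uniform_quantize(x, bits):
    """Symmetric uniform quantizer to the given bit-width (illustrative)."""
    levels = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / levels
    return torch.round(x / scale) * scale

def bitwidth_schedule(epoch):
    """Hypothetical progressive schedule: 8 -> 4 -> 2 bits."""
    if epoch < 10:
        return 8
    if epoch < 20:
        return 4
    return 2

w = torch.randn(64, 64)
for epoch in (0, 10, 20):
    w_q = uniform_quantize(w, bitwidth_schedule(epoch))
    print(epoch, bitwidth_schedule(epoch), w_q.unique().numel())
```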
“…Continuous binarization: a few recent works proposed to use a continuous activation function that increasingly resembles a binary activation function during training, thereby eliminating the approximation step across the activation function. Sakr et al. (2018) used a piecewise linear function whose slope gradually increases, while and Gong et al. (2019) proposed to use the sigmoid and tanh functions, respectively.…”
Section: Sophisticated STEs
confidence: 99%
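Since the paper under review belongs to this "continuous binarization" family, a minimal sketch of the idea follows: a bounded piecewise-linear activation whose slope is raised over training so that it approaches the binary sign function, while staying differentiable (almost everywhere) at every stage, so true gradients can be used. The specific slope values below are an assumed annealing schedule, not the one used by Sakr et al. (2018).

```python
import torch

def continuous_binarization(x, slope):
    """Piecewise-linear surrogate for sign(x), clipped to [-1, 1].

    For a small slope this is a soft ramp; as the slope grows it approaches
    the binary activation sign(x) while remaining differentiable a.e.
    """
    return torch.clamp(slope * x, -1.0, 1.0)

x = torch.linspace(-2, 2, 9, requires_grad=True)
for slope in (1.0, 4.0, 16.0):         # hypothetical slope annealing
    y = continuous_binarization(x, slope)
    y.sum().backward()                  # true gradients exist at every stage
    print(slope, y.detach(), x.grad.clone())
    x.grad.zero_()
```

The design point this sketch tries to capture is that, unlike an STE, no gradient is invented: the function actually used in the forward pass is the one being differentiated, and only its slope changes over training.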
“…Reducing DNN complexity via quantization has been an active area of research over the past few years. A majority of such works either train the quantized network from scratch [36,33,16,10,22,24] or fine-tune a pre-trained model with quantization-in-the-loop [13,19,30,32,1,35]. Where retraining is not an option, [25] provides analytical guarantees on the minimum precision requirements of a pre-trained FP network given a budget on the accuracy drop from FP.…”
Section: Related Work
confidence: 99%
“…Where retraining is not an option, [25] provides analytical guarantees on the minimum precision requirements of a pre-trained FP network given a budget on the accuracy drop from FP. Training-based quantization works fall into two classes of methods: 1) estimation-based methods [33,18,16,30,13], where the full-precision weights and activations are quantized in the forward path, and gradients are back-propagated through a non-differentiable quantizer function via a gradient estimator such as the Straight Through Estimator (STE) [2]; and 2) optimization-based methods, where gradients flow directly from the full-precision weights to the cost function via an approximate differentiable quantizer [32,19,24], or by including an explicit quantization error term in the loss function [7,35]. Application of these methods can be categorized into three clusters: Aggressive Quantization: methods such as binarization and ternarization have been highly successful for reducing DNN complexity.…”
Section: Related Work
confidence: 99%
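To make the contrast drawn in that statement concrete, here is a minimal sketch (assuming PyTorch) of the estimation-based approach: the forward pass applies a non-differentiable sign quantizer, and the backward pass uses the Straight Through Estimator [2], passing the incoming gradient through unchanged wherever |x| ≤ 1 (a common clipped variant). This is an illustrative implementation, not the exact code of any cited work.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: non-differentiable sign(x).
    Backward: clipped straight-through estimator."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Identity gradient inside [-1, 1], zero outside.
        return grad_output * (x.abs() <= 1.0).to(grad_output.dtype)

x = torch.randn(5, requires_grad=True)
y = BinarizeSTE.apply(x)
y.sum().backward()
print(y, x.grad)   # quantized forward values, pass-through gradients
```

Optimization-based methods, by contrast, avoid this mismatch between forward and backward passes by differentiating a smooth surrogate quantizer directly, as in the continuous-binarization sketch above, or by penalizing quantization error in the loss.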