2021
DOI: 10.3390/electronics10040396

Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm

Abstract: This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation id…
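NEMOKD evolves student networks that are trained with knowledge distillation, so the distillation loss is the building block its multi-objective search optimises. Below is a minimal sketch of that loss, assuming a PyTorch setup; the temperature, loss weighting and function names are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-target knowledge distillation (Hinton-style).

    Blends cross-entropy on the hard labels with a KL term that pulls
    the student's softened outputs towards the teacher's. The defaults
    here are illustrative, not the values used by NEMOKD.
    """
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    ce = F.cross_entropy(student_logits, labels)
    # Scale the KD term by T^2 so its gradient magnitude stays comparable
    # to the cross-entropy term as the temperature changes.
    return alpha * (temperature ** 2) * kd + (1.0 - alpha) * ce
```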

Cited by 14 publications (2 citation statements); references 32 publications.
“…In addition, Han et al [20] added Huffman coding after quantization, which can further reduce the memory size and operation time of the model. Wu, Stewart and Wang et al [21][22][23] designed a new quantization framework for the hardware level, and provided different quantization strategies for different neural networks and hardware structures. Besides pruning and quantization, knowledge distillation is also an effective method of model compression.…”
Section: Introduction (citation intent: mentioning; confidence: 99%)
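The quantisation frameworks referred to here, and evaluated in the paper on the Z7020 FPGA, reduce weight precision to a few bits. A minimal sketch of symmetric uniform post-training quantisation follows, assuming NumPy; it illustrates the general idea only and does not reproduce the quantisation-aware training used in the cited hardware flows.

```python
import numpy as np

def quantise_uniform(weights, n_bits=3):
    """Symmetric uniform quantisation of a weight tensor to n_bits.

    A generic post-training scheme for illustration; bit-width and the
    symmetric mapping are example choices, not the paper's exact scheme.
    """
    q_max = 2 ** (n_bits - 1) - 1                       # e.g. 3 for signed 3-bit
    scale = max(np.max(np.abs(weights)) / q_max, 1e-12)  # guard all-zero tensors
    q = np.clip(np.round(weights / scale), -q_max - 1, q_max).astype(np.int8)
    return q, scale                                      # dequantise with q * scale

# Example: quantise a random layer and measure the error introduced.
w = np.random.randn(128, 64).astype(np.float32)
q, s = quantise_uniform(w, n_bits=3)
print("mean abs error:", np.mean(np.abs(w - q * s)))
```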
“…However, with the advent of the Internet of Things, how to deploy high-performance DCNNs on embedded devices with limited hardware resources has become an urgent problem. To solve this problem, many model compression methods which reduce the model size and computational burden have been proposed, such as network quantization [7,8], model pruning [9], knowledge distillation [10,11], and lightweight model design [12].…”
Section: Introduction (citation intent: mentioning; confidence: 99%)
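Of the compression methods listed in this passage, pruning is the only one not illustrated above. A minimal sketch of unstructured magnitude pruning follows, again assuming NumPy; the sparsity level and helper name are hypothetical and not taken from any of the cited works.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Unstructured magnitude pruning: zero the smallest-magnitude weights.

    The sparsity target is an arbitrary example value; real pipelines
    typically prune iteratively and fine-tune between steps.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # keep only larger weights
    return weights * mask, mask
```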