2018
DOI: 10.1109/tnnls.2017.2774288

Quantized CNN: A Unified Approach to Accelerate and Compress Convolutional Networks

Abstract: We are witnessing an explosive development and widespread application of deep neural networks (DNNs) in various fields. However, DNN models, especially convolutional neural networks (CNNs), usually involve massive parameters and are computationally expensive, making them extremely dependent on high-performance hardware. This prohibits their further extensions, e.g., applications on mobile devices. In this paper, we present a quantized CNN, a unified approach to accelerate and compress convolutional networks. G…
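The abstract is truncated above, but the citing excerpts further down describe the paper as exploring scalar and vector quantization [4]. Purely as a hedged illustration of that general family (not the paper's actual algorithm), the sketch below quantizes a fully connected layer's weight matrix with a small learned codebook over weight sub-vectors; every function name and hyperparameter here (sub_dim, k) is an assumption chosen for the example.

```python
# Hedged illustration of codebook (vector) quantization of a weight matrix.
# NOT the paper's exact algorithm; names and hyperparameters are assumptions.
import numpy as np

def kmeans(x, k=256, iters=20, seed=0):
    """Plain k-means over the rows of x; returns (codebook, assignments)."""
    rng = np.random.default_rng(seed)
    codebook = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance between every sub-vector and every codeword.
        d2 = (x * x).sum(1, keepdims=True) - 2.0 * x @ codebook.T + (codebook * codebook).sum(1)
        assign = d2.argmin(1)
        for c in range(k):                      # move each codeword to the mean of its members
            members = x[assign == c]
            if len(members):
                codebook[c] = members.mean(0)
    return codebook, assign

def quantize_fc(W, sub_dim=4, k=256):
    """Split each column of W (in_dim x out_dim) into sub_dim-long pieces and quantize them."""
    in_dim, out_dim = W.shape
    assert in_dim % sub_dim == 0
    sub = W.reshape(in_dim // sub_dim, sub_dim, out_dim).transpose(0, 2, 1).reshape(-1, sub_dim)
    codebook, assign = kmeans(sub, k)
    W_hat = (codebook[assign]
             .reshape(in_dim // sub_dim, out_dim, sub_dim)
             .transpose(0, 2, 1)
             .reshape(in_dim, out_dim))
    # Only `assign` (one small integer per sub-vector) and `codebook` need to be stored.
    return W_hat, codebook, assign

W = np.random.randn(512, 256).astype(np.float32)
W_hat, codebook, codes = quantize_fc(W)
print("relative reconstruction error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

Compression comes from storing one small integer index per sub-vector plus the codebook instead of full-precision weights; acceleration schemes additionally reuse precomputed inner products between inputs and codewords, which this sketch does not attempt.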

Cited by 107 publications (63 citation statements)
References 11 publications
“… , $C_y$, and $W_{c_y}$ is the $c_y$-th column vector of $W$, and $C_x = C_y = 1024$ is selected for $f^{(14)}(\cdot)$. In (10), $\{f^{(i)}(\cdot)\}_{i \in \{2,5,8,11\}}$ represent the convolutional layers. …”
Section: Network Architecture and Training (mentioning)
confidence: 99%
“…where $d_x \times d_y$ is the size of the convolutional kernel, and $V_x \times V_y$ is the size of the response of a convolutional layer. $W_{v_y, p_k} \in \mathbb{R}^{V_x}$ denotes the weights of the $v_y$-th convolutional kernel, and $X_{p_x} \in \mathbb{R}^{V_x}$ is the input feature map at spatial position $p_x$. Hence we define $p_x$ and $p_k$ as the 2-D spatial positions in the feature maps and convolutional kernels, respectively [13], [14]. In the proposed architecture, we use 256 filters, the first two of which are of size 5 × 5 and the remaining two of size 3 × 3. …”
Section: Network Architecture and Training (mentioning)
confidence: 99%
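The "where…" clause in this excerpt refers to a response equation that did not survive extraction. Purely as a hedged reconstruction in the excerpt's own symbols (the response name $Y$ and the alignment $p_x = p_y + p_k$ are assumptions, and the citing paper's exact form may differ), the generic convolutional response it describes is

\[
  Y_{p_y, v_y} \;=\; \sum_{p_k} \big\langle W_{v_y, p_k},\, X_{p_x} \big\rangle , \qquad p_x = p_y + p_k ,
\]

i.e., the output of kernel $v_y$ at position $p_y$ is a sum of inner products between kernel slices and the input feature vectors they overlap.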
“…In order to accelerate inference and compress the size of DNN models, many network quantization methods have been proposed. Some studies focus on scalar and vector quantization [4,7], while others center on fixed-point quantization [18,19].…”
Section: Neural Network Quantization (mentioning)
confidence: 99%
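The excerpt above contrasts two quantization families. As a hedged sketch of that distinction only (not code from any of the cited works; the function names, bit-width, and codebook size are assumptions): scalar/codebook quantization replaces each weight by the nearest entry of a small learned set of values, while fixed-point quantization snaps weights onto a uniform integer grid with a single scale factor.

```python
# Hedged sketch of the two quantization families contrasted above (illustrative only).
import numpy as np

def fixed_point_quantize(w, bits=8):
    """Symmetric uniform quantization: w is approximated by scale * q with integer q."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale                       # store integers plus one scale per tensor

def scalar_quantize(w, k=16, iters=25, seed=0):
    """1-D k-means over individual weight values; store per-weight codebook indices."""
    rng = np.random.default_rng(seed)
    flat = w.ravel()
    centers = flat[rng.choice(flat.size, size=k, replace=False)].astype(np.float64)
    for _ in range(iters):
        assign = np.abs(flat[:, None] - centers[None, :]).argmin(1)
        for c in range(k):                # recentre each codebook entry on its members
            if np.any(assign == c):
                centers[c] = flat[assign == c].mean()
    return assign.reshape(w.shape), centers

w = np.random.randn(64, 64).astype(np.float32)
q, scale = fixed_point_quantize(w)
idx, centers = scalar_quantize(w)
print("fixed-point max error:", np.abs(w - scale * q).max())
print("codebook   max error:", np.abs(w - centers[idx]).max())
```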
“…Recently, many neural network quantization methods have been proposed. Gong Y. et al. [7] and Cheng J. et al. [4] explored scalar and vector quantization methods for compressing DNNs. Zhou A. et al. [18] and Zhou S. et al. [19] proposed fixed-point quantization methods.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, a variety of CNN compression methods have been proposed to tackle the aforementioned issues, such as quantization [9], [10], weight and feature approximation [11], encoding [12], approximation [13], and pruning [14], [15]. Among these, weight-pruning-based methods achieve the highest compression performance, since most pre-trained CNNs contain a considerable number of small-magnitude weights.…”
Section: Introduction (mentioning)
confidence: 99%
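The observation at the end of this excerpt (pre-trained CNNs contain many small-magnitude weights) is the premise of magnitude-based pruning. A minimal sketch follows; it is purely illustrative and not the specific pruning methods cited as [14], [15], and the function name and 90% sparsity target are assumptions.

```python
# Minimal magnitude-pruning sketch: zero the smallest-magnitude weights and keep a mask.
# Illustrative only; not the specific pruning methods cited as [14], [15].
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the fraction `sparsity` of weights with the smallest absolute value."""
    threshold = np.quantile(np.abs(w), sparsity)
    mask = (np.abs(w) >= threshold).astype(w.dtype)
    return w * mask, mask

w = np.random.randn(256, 256).astype(np.float32)
w_pruned, mask = magnitude_prune(w, sparsity=0.9)
print("fraction of weights kept:", mask.mean())   # roughly 0.10
```

The surviving weights are then typically fine-tuned and stored in a sparse format (e.g., CSR), which is where the compression gain comes from.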