Transform Quantization for CNN Compression

Young, Sean I.; Wang, Zhe; Taubman, David; Girod, Bernd

doi:10.1109/tpami.2021.3084839

Cited by 40 publications

(21 citation statements)

References 52 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…We now present a lower bound for the asymptotic minimax risk (4) in estimating θ from the linear model (2). The proof sketch in this section is outlined via Lemmas 4.1, 4.2, and 4.3 and build up to our proposed lower bound in Thm.…”

Section: Lower Bound For the Minimax Riskmentioning

confidence: 99%

“…Model compression techniques employ lossless and lossy source coding schemes to quantize the model parameters subject to finite precision constraints in order to make them deployable on memory-constrained devices [2,3]. In this work, we adopt an information-theoretic approach to the problem of quantizing linear models without any distributional assumptions on the data or the true model.…”

Section: Significance and Related Workmentioning

confidence: 99%

“…For a learning code Q, the mutual information between the data X (input) and Q(X) (output) is a reasonable proxy for the number of bits B used to design the corresponding encoder-decoder pair (E, D); in other words, the number of bits required for representing the estimate of θ. The marginal distribution of X is determined by W, θ, and v according to (2), and the distribution of the output of the learning code θ ≡ Q(X) is determined solely by the marginal of X and the conditional distribution of θ given X. This idea follows from the information-rate distortion function in rate distortion theory literature [12][Ch.…”

Section: Lower Bound For the Minimax Riskmentioning

confidence: 99%

See 2 more Smart Citations

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

Saha¹,

Pilancı²,

Goldsmith³

2022

Preprint

View full text Add to dashboard Cite

We consider the problem of quantizing a linear model learned from measurements X = Wθ + v. The model is constrained to be representable using only dB-bits, where B ∈ (0, ∞) is a pre-specified budget and d is the dimension of the model. We derive an information-theoretic lower bound for the minimax risk under this setting and show that it is tight with a matching upper bound. This upper bound is achieved using randomized embedding based algorithms. We propose randomized Hadamard embeddings that are computationally efficient while performing near-optimally. We also show that our method and upper-bounds can be extended for two-layer ReLU neural networks. Numerical simulations validate our theoretical claims.

show abstract

Section: Lower Bound For the Minimax Riskmentioning

confidence: 99%

Section: Significance and Related Workmentioning

confidence: 99%

Section: Lower Bound For the Minimax Riskmentioning

confidence: 99%

See 1 more Smart Citation

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

Saha¹,

Pilancı²,

Goldsmith³

2022

Preprint

View full text Add to dashboard Cite

show abstract

“…More recently, an iterative pruning and retraining algorithm to further reduce the size of deep models was proposed in [9,21]. The method of network quantization or weight sharing, i.e., employing a clustering algorithm to group the weights in a neural network, and its variants, including vector quantization [22], soft quantization [23,24], fixed point quantization [25], transform quantization [26], and Hessian weighted quantization [11], have been extensively investigated. Matrix factorization, where low-rank approximation of the weights in neural networks is used instead of the original weight matrix, has also been widely studied in [27][28][29].…”

Section: Related Workmentioning

confidence: 99%

“…We consider the following two compression algorithms. The first one is the conditional distribution P Ŵ|W in the proof of achievability (26), which requires the knowledge of w * and is denoted as "Oracle". The second one is the well-known K-means clustering algorithm, where the weights in W are grouped into K clusters and represented by the cluster centers in the reconstruction Ŵ.…”

Section: Evaluation and Visualizationmentioning

confidence: 99%

Population Risk Improvement with Model Compression: An Information-Theoretic Approach

Gao

Zou

et al. 2021

Entropy

View full text Add to dashboard Cite

It has been reported in many recent works on deep model compression that the population risk of a compressed model can be even better than that of the original model. In this paper, an information-theoretic explanation for this population risk improvement phenomenon is provided by jointly studying the decrease in the generalization error and the increase in the empirical risk that results from model compression. It is first shown that model compression reduces an information-theoretic bound on the generalization error, which suggests that model compression can be interpreted as a regularization technique to avoid overfitting. The increase in empirical risk caused by model compression is then characterized using rate distortion theory. These results imply that the overall population risk could be improved by model compression if the decrease in generalization error exceeds the increase in empirical risk. A linear regression example is presented to demonstrate that such a decrease in population risk due to model compression is indeed possible. Our theoretical results further suggest a way to improve a widely used model compression algorithm, i.e., Hessian-weighted K-means clustering, by regularizing the distance between the clustering centers. Experiments with neural networks are provided to validate our theoretical assertions.

show abstract

Communication-Efficient Federated Learning with Multi-layered Compressed Model Update and Dynamic Weighting Aggregation

Zhong

Liu

2021

Artificial Intelligence

View full text Add to dashboard Cite

Transform Quantization for CNN Compression

Cited by 40 publications

References 52 publications

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

Population Risk Improvement with Model Compression: An Information-Theoretic Approach

Communication-Efficient Federated Learning with Multi-layered Compressed Model Update and Dynamic Weighting Aggregation

Contact Info

Product

Resources

About