We develop a fast, tractable technique called Net-Trim for simplifying a trained neural network. The method is a convex post-processing module, which prunes (sparsifies) a trained network layer by layer while preserving the internal responses. We present a comprehensive analysis of Net-Trim from both the algorithmic and sample complexity standpoints, centered on a fast, scalable convex optimization program. Our analysis includes consistency results between the initial and retrained models before and after Net-Trim is applied, and guarantees on the number of training samples needed to discover a network that can be expressed using a certain number of nonzero terms. Specifically, if there is a set of weights with at most s nonzero terms that can re-create the layer outputs from the layer inputs, we can find these weights from O(s log(N/s)) samples, where N is the input size. These theoretical results are similar to those for sparse regression using the Lasso, and our analysis uses some of the same recently developed tools (namely, results on concentration of measure and convex analysis). Finally, we propose an algorithmic framework based on the alternating direction method of multipliers (ADMM), which allows a fast and simple implementation of Net-Trim for network pruning and compression.

The number of samples P can be taken on the order of s log(N/s). We also show that if the x_p are subgaussian, then so are the y_p; as a result, the theory can be applied layer by layer, yielding a sampling result for networks of arbitrary depth. (When we apply the algorithm in practice, the equality constraints in (1) are relaxed; this is discussed in detail in Section 3.1.) Along with these theoretical guarantees, Net-Trim offers state-of-the-art performance on realistic networks. In Section 6, we present numerical experiments showing that compression factors between 10x and 50x (removing 90% to 98% of the connections) are possible with very little loss in test accuracy.

Contributions and relations to previous work. This paper provides a full description of the Net-Trim method from both a theoretical and an algorithmic perspective. In Section 3, we present our convex formulation for sparsifying the weights in the linear layers of a network; we describe how the procedure can be applied layer by layer in a deep network, either in parallel or serially (cascading the results), and present consistency bounds for both approaches. Section 4 presents our main theoretical result, stated precisely in Theorem 4. This result gives an upper bound on the number of data samples needed to reliably discover a layer that has at most s connections in its linear part: we show that if the data samples are random, these weights can be learned from O(s log(N/s)) samples. Mathematically, this result is comparable to the sample complexity bounds for the Lasso in sparse regression on a linear model (also known as the compressed sensing problem). Our analysis is based on the bowling scheme [30, 24]; the main technical challenges are adapting this technique to the piecewise linear...
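As a rough illustration of the relaxed per-layer program described above, the sketch below poses the layer-wise sparsification as a convex problem: minimize the ell-1 norm of the weights subject to matching the recorded ReLU responses where a unit was active and keeping the pre-activation non-positive where it was not. It is written with cvxpy for readability rather than the ADMM solver the paper proposes, and the names (X, Y, eps) and the exact constraint form are illustrative assumptions, not the paper's notation.

```python
# Minimal sketch of a relaxed Net-Trim-style layer sparsification (assumed form).
import numpy as np
import cvxpy as cp

def net_trim_layer(X, Y, eps):
    """Find sparse weights whose ReLU responses stay within eps of Y.

    X : (n_in, P) matrix of layer inputs over P training samples
    Y : (n_out, P) matrix of the original layer's ReLU outputs
    """
    n_in, P = X.shape
    n_out = Y.shape[0]
    W = cp.Variable((n_in, n_out))
    on = (Y.T > 0).astype(float)   # mask of samples/units where the ReLU fired
    off = 1.0 - on
    Z = X.T @ W                    # pre-activations under the new weights
    constraints = [
        # match the recorded responses on the active set ...
        cp.norm(cp.multiply(on, Z - Y.T), 'fro') <= eps,
        # ... and keep the pre-activation non-positive on the inactive set
        cp.multiply(off, Z) <= 0,
    ]
    prob = cp.Problem(cp.Minimize(cp.sum(cp.abs(W))), constraints)
    prob.solve()
    return W.value
```

Applied layer by layer (in parallel on the original responses, or serially on the retrained ones), this kind of program is what the ADMM framework in the paper solves at scale.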
In distributed training of deep models, the transmission volume of stochastic gradients (SG) is a bottleneck in scaling up the number of processing nodes. Existing methods for compressing SGs have two major drawbacks. First, because compression increases the overall variance of the SG, the hyperparameters of the learning algorithm must be readjusted to ensure convergence, and even then the convergence rate is adversely affected. Second, for approaches that produce biased compressed SG values, convergence is not guaranteed, so an error-feedback mechanism is often required. We propose Quantized Compressive Sampling (QCS) of SG, which addresses both issues while achieving an arbitrarily large compression gain. We introduce two variants of the algorithm, Unbiased-QCS and MMSE-QCS, and show that they outperform other approaches; specifically, for the same number of communication bits, the convergence rate improves by a factor of 2 relative to the state of the art. We then propose a weighted error feedback to further improve the convergence rate of the distributed training algorithm: we develop and analyze a method that both controls the overall variance of the compressed SG and prevents staleness of the updates. Finally, through simulations, we validate our theoretical results and establish the superior performance of the proposed SG compression in distributed training of deep models. Our simulations also demonstrate that the proposed compression method substantially expands the range of step sizes for which the learning algorithm converges.
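To make the general pattern concrete, the sketch below shows a worker-side update that compresses the stochastic gradient by projecting it onto a small number of random directions, quantizing coarsely, and carrying a weighted copy of the compression error into the next round. The specific compressor (random sign projection plus 1-bit quantization with a single scale) and the feedback weight beta are illustrative stand-ins, not the paper's exact QCS construction or analysis.

```python
# Hedged sketch of compressed SG exchange with weighted error feedback (assumed form).
import numpy as np

def compress(g, m, rng):
    """Compressively sample a gradient: project onto m random +/-1 directions,
    quantize to sign plus one scale factor, and map back to the model dimension."""
    A = rng.choice([-1.0, 1.0], size=(m, g.size)) / np.sqrt(m)
    y = A @ g                              # compressive measurements
    q = np.sign(y) * np.mean(np.abs(y))    # coarse (1-bit + scale) quantization
    return A.T @ q                         # decoded, low-information gradient

def worker_step(grad, residual, m, beta, rng):
    """One worker iteration: compress the gradient plus a weighted copy of last
    round's compression error, and keep the new error locally for next round."""
    corrected = grad + beta * residual
    sent = compress(corrected, m, rng)
    new_residual = corrected - sent
    return sent, new_residual
```

Choosing m much smaller than the model dimension gives the compression gain, while the weighted residual term is what limits both the added variance and the staleness of the updates.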