2018 IEEE Statistical Signal Processing Workshop (SSP)
DOI: 10.1109/ssp.2018.8450819

Simultaneous Sparsity and Parameter Tying for Deep Learning Using Ordered Weighted ℓ1 Regularization

Cited by 2 publications (1 citation statement)
References 9 publications
“…Similarly, in [42] proximal gradient descent with a group OWL constraint (grOWL [43]) is used to simultaneously sparsify neurons and enforce parameter sharing. Proximal gradient descent is also used in [44], where Ordered Weighted ℓ1 regularization (OWL [45]) allows simultaneously sparsifying weights and optimizing weight sharing. In [46], filters in CNN layers are pruned by solving an optimization problem with a dedicated optimizer using either group LASSO or ℓ2,0 regularization.…”
Section: Learning Structured Sparse DNNs Using Proximal Regularization Methods
confidence: 99%
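The citation statement above refers to proximal gradient descent with OWL regularization as the mechanism for simultaneous sparsification and weight tying. As a minimal sketch, not the paper's implementation, the OWL proximal operator can be computed by sorting the magnitudes, subtracting the ordered weights, projecting onto the nonincreasing nonnegative cone via pool-adjacent-violators, and undoing the sort. The NumPy code below assumes a flattened layer weight vector v and a nonincreasing, nonnegative weight vector w (OSCAR-style weights are used here purely for illustration); the names prox_owl and isotonic_decreasing are hypothetical helpers, not from the cited works.

    import numpy as np

    def isotonic_decreasing(y):
        # Euclidean projection of y onto {z : z_1 >= z_2 >= ... >= z_n}
        # via the pool-adjacent-violators algorithm (PAVA).
        n = len(y)
        vals, lens, starts = [], [], []
        for i in range(n):
            vals.append(float(y[i])); lens.append(1); starts.append(i)
            # merge adjacent blocks while the nonincreasing constraint is violated
            while len(vals) > 1 and vals[-2] < vals[-1]:
                merged_len = lens[-2] + lens[-1]
                merged_val = (vals[-2] * lens[-2] + vals[-1] * lens[-1]) / merged_len
                vals.pop(); vals.pop(); vals.append(merged_val)
                lens.pop(); lens.pop(); lens.append(merged_len)
                starts.pop()
        out = np.empty(n)
        for v_, l_, s_ in zip(vals, lens, starts):
            out[s_:s_ + l_] = v_
        return out

    def prox_owl(v, w):
        # Proximal operator of the OWL norm: argmin_x 0.5*||x - v||^2 + sum_i w_i * |x|_[i],
        # where |x|_[i] are the magnitudes of x sorted in decreasing order and w is nonincreasing.
        abs_v = np.abs(v)
        order = np.argsort(-abs_v)                 # indices of |v| in decreasing order
        z = abs_v[order] - w                       # soft-threshold with the ordered weights
        z = np.maximum(isotonic_decreasing(z), 0)  # project onto the monotone nonnegative cone
        x = np.zeros_like(v, dtype=float)
        x[order] = z                               # undo the sort
        return np.sign(v) * x

    # Illustrative single proximal gradient step on a (hypothetical) layer weight vector.
    rng = np.random.default_rng(0)
    n = 1000
    v = rng.normal(size=n)                         # current (flattened) layer weights
    grad = rng.normal(size=n)                      # stand-in for a backprop gradient
    lam1, lam2 = 1e-3, 1e-4
    w = lam1 + lam2 * np.arange(n - 1, -1, -1)     # OSCAR-style nonincreasing weights
    step = 0.1
    v_new = prox_owl(v - step * grad, step * w)

Because entries whose magnitudes are averaged by the isotonic projection end up with identical absolute values, this prox simultaneously drives small weights to zero (sparsity) and ties groups of surviving weights together (parameter sharing), which is the behavior the citation statement attributes to OWL-regularized training.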