2018
DOI: 10.48550/arxiv.1805.01532
Preprint

Lifted Neural Networks

Abstract: We describe a novel family of models of multilayer feedforward neural networks in which the activation functions are encoded via penalties in the training problem. Our approach is based on representing a non-decreasing activation function as the argmin of an appropriate convex optimization problem. The new framework allows for algorithms such as block-coordinate descent methods to be applied, in which each step is composed of a simple (no hidden layer) supervised learning problem that is parallelizable across …
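
To make the argmin representation concrete, here is a minimal numerical sketch. It assumes one standard quadratic form of such a representation, relu(u) = argmin_{x ≥ 0} (x − u)², which is not necessarily the paper's exact construction; a brute-force search over a nonnegative grid recovers the same values as the closed-form activation.

```python
import numpy as np

# Illustrative only: one standard way to write ReLU as the argmin of a
# convex problem, relu(u) = argmin_{x >= 0} (x - u)^2. The exact penalty
# used in the paper may differ.

def relu(u):
    return np.maximum(u, 0.0)

def relu_as_argmin(u, xs=np.linspace(0.0, 10.0, 100_001)):
    """Brute-force the convex problem argmin_{x >= 0} (x - u)^2 on a grid."""
    return xs[np.argmin((xs - u) ** 2)]

for u in (-2.5, 0.0, 3.7):
    # Both columns agree, up to the grid resolution of the brute-force check.
    print(u, relu(u), relu_as_argmin(u))
```

In a lifted training problem, layer equations of the form x_{l+1} = relu(W_l x_l) can then be relaxed into penalties of this quadratic form, which is the sense in which the abstract says the activations are "encoded via penalties".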

Cited by 12 publications (34 citation statements)
References 5 publications
“…One of these has focused on using MIP formulations for already trained networks to provide adversarial samples that can improve network stability [1,11,23]. In a different stream, [2] propose using convex relaxations for training ANNs. The authors also explore non-gradient-based approaches and initialized weights to accelerate convergence of gradient-based algorithms.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Thus, a natural idea in implicit learning is to keep the state vector as a variable in the training problem, resulting in a higher-dimensional (or, "lifted") expression of the training problem. The idea of lifting the dimension of the training problem in (non-implicit) deep learning by introducing "state" variables has been studied in a variety of works; a non-extensive list includes [17], [3], [10], [22], [23], [6] and [15]. Lifted models are trained using block coordinate descent methods, the Alternating Direction Method of Multipliers (ADMM), or iterative, non-gradient-based methods.…”
Section: Related Work (mentioning)
confidence: 99%
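
As a rough illustration of the block coordinate descent training mentioned above, the sketch below alternates closed-form least-squares updates over the two weight blocks and the lifted hidden-state block of a toy two-layer objective. The data, the quadratic layer-consistency penalty, the ridge term `lam`, and the projected state update are all illustrative assumptions, not the exact formulation of any of the cited papers.

```python
import numpy as np

# Schematic block-coordinate descent on a "lifted" two-layer least-squares
# objective. Shapes, data, and hyperparameters are hypothetical.
rng = np.random.default_rng(0)
n, d_in, d_hid, d_out = 200, 5, 8, 1
X0 = rng.normal(size=(n, d_in))            # inputs
Y = rng.normal(size=(n, d_out))            # targets
lam = 1e-3                                 # assumed ridge regularizer

W1 = 0.1 * rng.normal(size=(d_in, d_hid))
W2 = 0.1 * rng.normal(size=(d_hid, d_out))
X1 = np.maximum(X0 @ W1, 0.0)              # lifted hidden "state" variable

def ridge(A, B, lam):
    """Closed-form solution of min_W ||A W - B||_F^2 + lam ||W||_F^2."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ B)

def objective(W1, W2, X1):
    # Layer-consistency penalty (ReLU encoded by keeping X1 >= 0) plus output loss.
    return (np.linalg.norm(X1 - X0 @ W1) ** 2
            + np.linalg.norm(Y - X1 @ W2) ** 2)

for it in range(20):
    # Each block update is a simple (no hidden layer) least-squares problem.
    W1 = ridge(X0, X1, lam)                # weight block 1
    W2 = ridge(X1, Y, lam)                 # weight block 2
    # State block: unconstrained minimizer of the two quadratic terms,
    # then a projection onto X1 >= 0 (a crude surrogate for the exact
    # nonnegatively constrained step).
    rhs = X0 @ W1 + Y @ W2.T
    X1 = np.maximum(rhs @ np.linalg.inv(np.eye(d_hid) + W2 @ W2.T), 0.0)
    if it % 5 == 0:
        print(f"iter {it:2d}  lifted objective = {objective(W1, W2, X1):.3f}")
```

Each block update here is itself a simple convex subproblem, which is what makes coordinate-wise or ADMM-style schemes natural for lifted objectives.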
“…al propose a convex optimization approach based on a low-rank relaxation using the nuclear norm regularizer [25]. In [2], Askari et al consider neural net objectives which are convex over blocks of variables. A number of recent results considered the gradient descent method on the non-convex training objective, and proved that it recovers the planted model parameters under distributional assumptions on the training data [9,22,24].…”
Section: Related Work and Contributions (mentioning)
confidence: 99%