2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2017.7952563

Harnessing neural networks: A random matrix approach

Abstract: This article proposes an original approach to the performance understanding of large dimensional neural networks. In this preliminary study, we consider a single-hidden-layer feed-forward network with random input connections (also called an extreme learning machine) which performs a simple regression task. By means of a new random matrix result, we prove that, as the size and cardinality of the input data and the number of neurons grow large, the network performance is asymptotically deterministic. This entails a b…
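
As a point of reference for the model described in the abstract, here is a minimal sketch of an extreme learning machine performing ridge regression: a single-hidden-layer network whose input weights are random and fixed, with only the linear readout learned. The Gaussian weights, sigmoid activation, regularization constant and synthetic data below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def elm_ridge_regression(X_train, y_train, X_test, n_neurons=512, gamma=1e-2, seed=0):
    """Single-hidden-layer network with random (untrained) input weights.

    Only the linear output layer is learned, via ridge regression,
    so there is no backpropagation. X has shape (n_samples, p).
    """
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    # Random input connections, drawn once and kept fixed.
    W = rng.standard_normal((p, n_neurons)) / np.sqrt(p)
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))  # sigmoid activation (assumed)
    # Hidden-layer representations of train and test data.
    S_train = sigma(X_train @ W)
    S_test = sigma(X_test @ W)
    # Ridge-regression readout: beta = (S'S + gamma I)^{-1} S' y.
    A = S_train.T @ S_train + gamma * np.eye(n_neurons)
    beta = np.linalg.solve(A, S_train.T @ y_train)
    return S_test @ beta

# Toy usage on synthetic regression data (illustrative only).
rng = np.random.default_rng(1)
X_tr, X_te = rng.standard_normal((2000, 100)), rng.standard_normal((500, 100))
w_true = rng.standard_normal(100)
y_tr = X_tr @ w_true + 0.1 * rng.standard_normal(2000)
y_te = X_te @ w_true + 0.1 * rng.standard_normal(500)
mse = np.mean((elm_ridge_regression(X_tr, y_tr, X_te) - y_te) ** 2)
print(f"test MSE: {mse:.3f}")
```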

Cited by 4 publications (4 citation statements) | References 9 publications
“…Recently, the successes of deep learning, along with the disqualifying complexity of studying real-world problems, have sparked a revived interest in the direction of random weight matrices. Recent results, without claim of exhaustivity, were obtained on the spectrum of the Gram matrix at each layer using random matrix theory [32,33], on the expressivity of deep neural networks [34], on the dynamics of propagation and learning [35][36][37][38], on the high-dimensional non-convex landscape where the learning takes place [39], or on the universal random Gaussian neural nets of [40].…”
Section: Other Related Work (mentioning)
confidence: 99%
“…In this article, following our seminal works [5,6], we propose a different angle of approach to neural network analysis. Rather than modelling a complete deep neural net, we focus here primarily on simple network structures, so far not considering backpropagation learning but accounting for nonlinearities induced when traversing a hidden layer.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [5], we merely exploited Feature (ii) as a technical means to study the asymptotic (as n, p → ∞) performance of extreme learning machines (ELM) [7] (i.e., single hidden-layer regression networks with no backpropagation learning), assuming a model encompassing a random connectivity matrix (which induces the concentration of the output vectors) but deterministic data. Under this model, however, while the asymptotic network training performance was readily accessible, the asymptotic generalization performance remained out of technical grasp, and only a conjecture under "reasonable" yet unclear assumptions on the deterministic dataset could be proposed.…”
Section: Introduction (mentioning)
confidence: 99%
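
The concentration phenomenon invoked above can be illustrated numerically. The following sketch is not taken from either cited paper; the dimensions, the tanh nonlinearity, the ridge regularizer and the seeds are assumptions made purely for illustration. It holds a synthetic dataset fixed and redraws only the random connectivity matrix: the training mean-square error barely varies across draws, and the variation shrinks as the dimensions grow.

```python
import numpy as np

def elm_train_mse(n, p, n_neurons, gamma=1e-2, seed=0):
    """Training MSE of a random-weights ELM on a fixed synthetic dataset."""
    # The dataset is generated once per (n, p) and kept fixed across seeds,
    # playing the role of the "deterministic data" in the quoted passage.
    data_rng = np.random.default_rng(123)
    X = data_rng.standard_normal((n, p))
    y = X @ data_rng.standard_normal(p)
    # Only the random connectivity matrix changes with the seed.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((p, n_neurons)) / np.sqrt(p)
    S = np.tanh(X @ W)
    beta = np.linalg.solve(S.T @ S + gamma * np.eye(n_neurons), S.T @ y)
    return np.mean((S @ beta - y) ** 2)

# Training MSE fluctuates little across independent random-weight draws,
# and the fluctuations shrink as n, p and the number of neurons grow.
for n, p, m in [(200, 40, 100), (2000, 400, 1000)]:
    mses = [elm_train_mse(n, p, m, seed=s) for s in range(10)]
    print(f"n={n}, p={p}, neurons={m}: mean={np.mean(mses):.4f}, std={np.std(mses):.2e}")
```
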
“…Based on similar insights that the curse of dimensionality may harm the capabilities of RF or ELM, the authors of [195] propose deep semi-random features as an alternative, which are shown to have better expressive power than RF and a better generalization error bound than common deep neural networks. In a different vein, the authors of [196] study the characteristics of ELM via random matrix theory.…”
Section: Extreme Learning Machine (mentioning)
confidence: 99%