This article studies the Gram random matrix model G = (1/T) Σ^⊤Σ, with Σ = σ(WX), classically found in the analysis of random feature maps and random neural networks, where X = [x₁, …, x_T] ∈ ℝ^{p×T} is a (data) matrix of bounded norm, W ∈ ℝ^{n×p} is a matrix of independent zero-mean unit-variance entries, and σ : ℝ → ℝ is a Lipschitz continuous (activation) function, σ(WX) being understood entrywise. By means of a key concentration-of-measure lemma arising from non-asymptotic random matrix arguments, we prove that, as n, p, T grow large at the same rate, the resolvent Q = (G + γI_T)⁻¹, for γ > 0, behaves similarly to its counterpart in sample covariance matrix models, involving notably the moment Φ = (T/n) E[G]; this provides in passing a deterministic equivalent for the empirical spectral measure of G. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insight into the mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
This article proposes an original approach to the performance understanding of large dimensional neural networks. In this preliminary study, we consider a single-hidden-layer feed-forward network with random input connections (also known as an extreme learning machine) performing a simple regression task. By means of a new random matrix result, we prove that, as the size and cardinality of the input data and the number of neurons grow large, the network performance is asymptotically deterministic. This entails a better comprehension of the effects of the hyperparameters (activation function, number of neurons, etc.) under this simple setting, thereby paving the path to the harnessing of more involved structures.
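For concreteness, an extreme learning machine of this kind can be sketched in a few lines: the input layer W is random and fixed, and only the output weights are learned in closed form via ridge regression. Sizes, the tanh activation, the regression target, and γ are all illustrative assumptions, not choices from the article:

```python
import numpy as np

# Minimal sketch of a single-hidden-layer network with random, untrained
# input weights (extreme learning machine) solving a regression task.
rng = np.random.default_rng(1)
p, n, T, gamma = 20, 100, 200, 1e-2

X = rng.standard_normal((p, T))   # training inputs (one sample per column)
y = np.sin(X[0])                  # an arbitrary simple regression target
W = rng.standard_normal((n, p))   # random input layer, never trained
Sigma = np.tanh(W @ X)            # hidden-layer activations

# Only the output weights beta are learned, via the closed-form ridge solution
beta = np.linalg.solve(Sigma @ Sigma.T / T + gamma * np.eye(n),
                       Sigma @ y / T)

y_hat = beta @ Sigma              # in-sample predictions
mse = np.mean((y - y_hat) ** 2)
```

The asymptotic analysis in the article shows that, as p, n, T grow large together, quantities such as this training error concentrate around deterministic values depending only on the hyperparameters.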
Multi-Task Learning (MTL) efficiently leverages useful information contained in multiple related tasks to help improve the generalization performance of all tasks. This article conducts a large dimensional analysis of a simple, yet (as we shall see) extremely powerful when carefully tuned, Least Squares Support Vector Machine (LSSVM) version of MTL, in the regime where the dimension p of the data and their number n grow large at the same rate. Under mild assumptions on the input data, the theoretical analysis of the MTL-LSSVM algorithm first reveals the "sufficient statistics" exploited by the algorithm and their interaction at work. These results demonstrate, as a striking consequence, that the standard approach to MTL-LSSVM is largely suboptimal and can lead to severe negative transfer, but that these impairments are easily corrected. These corrections are turned into an improved MTL-LSSVM algorithm which can only benefit from additional data, and whose theoretical performance is also analyzed. As evidenced and theoretically supported in numerous recent works, these large dimensional results are robust to broad ranges of data distributions, which our present experiments corroborate. Specifically, the article reports a systematically close match between theoretical and empirical performances on popular datasets, which is strongly suggestive of the applicability of the proposed carefully tuned MTL-LSSVM method to real data. This fine-tuning is fully based on the theoretical analysis and in particular requires no cross-validation procedure. Besides, the reported performances on real datasets almost systematically outperform much more elaborate and less intuitive state-of-the-art multi-task and transfer learning methods.
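As background for the MTL variant analyzed above, the single-task LSSVM building block amounts to solving one linear system in place of the SVM's quadratic program. Below is a minimal sketch with a linear kernel; the synthetic data and the choice of γ are our own illustrative assumptions:

```python
import numpy as np

# Minimal single-task least-squares SVM (LSSVM) sketch: binary
# classification reduces to one linear system. Data and gamma are
# illustrative, not taken from the article.
rng = np.random.default_rng(2)
p, n, gamma = 5, 100, 1.0

# Two Gaussian classes with means -1 and +1 in every coordinate
y = np.sign(np.arange(n) - n / 2 + 0.5)            # labels in {-1, +1}
X = rng.standard_normal((n, p)) + np.outer(y, np.ones(p))

K = X @ X.T                                         # linear kernel matrix

# LSSVM dual: solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
rhs = np.concatenate(([0.0], y))
sol = np.linalg.solve(A, rhs)
b, alpha = sol[0], sol[1:]

scores = K @ alpha + b                              # decision values
acc = np.mean(np.sign(scores) == y)                 # training accuracy
```

The MTL-LSSVM of the article couples several such systems across tasks; the large dimensional analysis then dictates how that coupling (and γ) should be tuned, without cross-validation.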