2021
DOI: 10.1088/1751-8121/ac38ec

A transport equation approach for deep neural networks with quenched random weights

Abstract: We consider a multi-layer Sherrington-Kirkpatrick spin-glass as a model for deep restricted Boltzmann machines with quenched random weights and solve for its free energy in the thermodynamic limit by means of Guerra's interpolating techniques under the RS and 1RSB ansatz. In particular, we recover the expression already known for the replica-symmetric case. Further, we drop the restriction constraint by introducing intra-layer connections among spins and we show that the resulting system can be mapped into a m…

Cited by 9 publications (3 citation statements)
References 40 publications
“…The latter, accounting for RSB phenomena, can be obtained by iteratively perturbing the RS interpolation scheme (e.g. see [47,37,48]), thus, our results find direct application on the practical side and provide the starting point for further refinements on the theoretical side.…”
Section: Generalities and Notation (mentioning)
confidence: 79%
“…In this case, we can take advantage of rigorous mathematical methods by applying sum rules [26] or by mapping the relevant quantities (the free energy or the model order parameters) of the statistical setting to the solutions of PDE systems. Indeed, differential equations involving the partition functions (or related quantities) of thermodynamic models have been extensively investigated in the literature, see for example [27][28][29][30][31][32][33][34][35][36][37]. In particular, they allow us to express the equation of state (or the self-consistency equations) governing the equilibrium dynamics of the system in terms of solutions of non-linear differential equations, and to describe phase transition phenomena as the development of shock waves, thus linking critical behaviours to gradient catastrophe theory [38][39][40][41].…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, the Hopfield model has been shown to be equivalent to a restricted Boltzmann machine [21], an archetypical model for machine learning [22], and sparse restricted Boltzmann machines have been mapped to Hopfield models with diluted patterns [23,24]. Furthermore, restricted Boltzmann machines with generic priors have led to the definition of generalized Hopfield models [25][26][27] and neural networks with multi-node Hebbian interactions have recently been shown to be equivalent to higher-order Boltzmann machines [28,29] and deep Boltzmann machines [30,31]. As a result, multi-node Hebbian learning is receiving a second wave of interest since its foundation in the eighties [13,14] as a paradigm to understand deep learning [17,32].…”
Section: Introductionmentioning
confidence: 99%