2018
DOI: 10.1103/physrevx.8.041006
Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

Abstract: Restricted Boltzmann machines (RBMs) are energy-based neural networks which are commonly used as building blocks for deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely connected systems with weak interactions, coming from spin-glass theory. While the TAP approach has been extensively studied for fully visible binary spin systems, our constructi…
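The TAP approach replaces Gibbs sampling with a deterministic fixed-point iteration for the unit magnetizations: a naive mean-field term corrected by a second-order Onsager "reaction" term. A minimal sketch of such an iteration for a binary {0, 1} RBM is below; the function name, damping scheme, and initialization are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def tap_magnetizations(W, b_v, b_h, n_iter=100, damping=0.5, seed=0):
    """Damped fixed-point iteration for TAP mean-field magnetizations
    of a binary {0,1} RBM (illustrative sketch, not the paper's exact
    scheme).

    W: (n_vis, n_hid) coupling matrix; b_v, b_h: visible/hidden biases.
    Returns approximate magnetizations (mv, mh).
    """
    rng = np.random.default_rng(seed)
    mv = rng.uniform(0.25, 0.75, size=W.shape[0])
    mh = rng.uniform(0.25, 0.75, size=W.shape[1])
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    for _ in range(n_iter):
        # Naive mean-field input plus second-order Onsager correction,
        # which uses the variances m(1-m) of the opposite layer.
        var_h = mh * (1.0 - mh)
        new_mv = sigmoid(b_v + W @ mh - (mv - 0.5) * (W**2 @ var_h))
        var_v = mv * (1.0 - mv)
        new_mh = sigmoid(b_h + W.T @ new_mv - (mh - 0.5) * (W.T**2 @ var_v))
        # Damping stabilizes the fixed-point iteration.
        mv = damping * mv + (1 - damping) * new_mv
        mh = damping * mh + (1 - damping) * new_mh
    return mv, mh
```

The resulting magnetizations can then stand in for the stochastic samples used in contrastive-divergence-style gradient estimates.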

Cited by 29 publications (33 citation statements)
References 58 publications
“…They play the role of an interacting field among visible nodes. Usually the nodes are binary-valued (of Boolean type or Bernoulli distributed) but Gaussian distributions or more broadly arbitrary distributions on real-valued bounded support are also used [26], ultimately making RBMs adapted to more heterogeneous data sets. Here to simplify we assume that visible and hidden nodes will be taken as binary variables si, σj ∈ {−1, 1} (using ±1 values gives the advantage of working with symmetric equations hence avoiding to deal with the "hidden" biases on the variables that appear when considering binary {0, 1} variables).…”
Section: W Ijmentioning
confidence: 99%
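The excerpt above notes that ±1 spins yield symmetric equations with no extra "hidden" biases. A minimal sketch of the corresponding RBM energy, with the global spin-flip symmetry that holds at zero fields, might look as follows (the function name and sign conventions are assumptions; bias conventions vary across papers):

```python
import numpy as np

def rbm_energy(s, sigma, W, a, b):
    """Energy of an RBM configuration with spins s, sigma in {-1, +1}:

        E(s, sigma) = - a.s - b.sigma - s^T W sigma

    a, b are the visible/hidden fields; W couples the two layers.
    """
    return -(a @ s) - (b @ sigma) - (s @ W @ sigma)

# At zero fields (a = b = 0), flipping every spin leaves E unchanged,
# which is the symmetry the quoted passage exploits.
```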
“…The reason for this is that the Laplace distribution leads to less interference among modes than the Gaussian distribution, so that the modes will weakly interact in the mean-field equations. Solving equations (21, 22, 27, 29) in the absence of fields yields the following picture: one fixed-point solution will typically have non-vanishing magnetizations {mα,mα} for all α such that wα ∈ [wmax − ∆w, wmax], where ∆w is approximately the gap ∆w(q,q) defined in (26). This solution is a degenerate ground state, all other solutions being obtained by independently reversing the signs of the condensed magnetizations (mα,mα).…”
Section: Non-linear Regimementioning
confidence: 99%
“…Recently, RBMs have gained renewed attention in physics since Carleo and Troyer [ 6 ] showed that a quantum many-body state could be efficiently represented by the RBM. Gabrié et al. and Tramel et al. [ 7 ] employed the Thouless–Anderson–Palmer mean-field approximation, used for the spin-glass problem, to replace the Gibbs sampling of contrastive-divergence training. Amin et al. [ 8 ] proposed a quantum Boltzmann machine based on the quantum Boltzmann distribution of a quantum Hamiltonian.…”
Section: Introductionmentioning
confidence: 99%
“…Unsupervised learning [2] of visual features from the huge amounts of unlabeled video available today, using deep networks, can be a potential solution to the data hunger of supervised algorithms. Autoencoders, Restricted Boltzmann Machines (RBMs), and the like [3,8,26], trained in a greedy layer-wise fashion, have been among the popular methods for learning visual features from images in an unsupervised manner. However, such approaches fail to discover higher-level structures in the data, necessary in recognition tasks.…”
Section: Introductionmentioning
confidence: 99%