1995
DOI: 10.1103/physrevlett.74.4337
Exact Solution for On-Line Learning in Multilayer Neural Networks

Cited by 161 publications (262 citation statements)
References 5 publications
“…Since having a discrete teacher is merely a special case, not using the knowledge that the teacher is confined to a discrete set of values gives the well-known results; an exponential decay in the case of continuous rule (on-line learning [12,13]) and a power law decay in the case of binary rule (on-line and off-line learning [14][15][16][17][18]). The way to gain from the knowledge of the discrete nature of the weights is in the center of our work, and it is based on having in addition a discrete student W S derived from the continuous one using the following clipping procedure.…”
Section: B. Dynamics of the Weights
Confidence: 99%
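The clipping procedure the excerpt refers to, deriving a discrete student W_S from the continuous weights, can be sketched as follows. This is a minimal illustration only; the excerpt does not specify the discrete set, so the weight levels used here are an assumption:

```python
import numpy as np

def clip_weights(w_continuous, levels=(-1.0, 1.0)):
    """Map each continuous weight to the nearest allowed discrete level.

    With the default binary levels {-1, +1} this reduces to the sign
    of each weight.
    """
    levels = np.asarray(levels, dtype=float)
    # For each weight, pick the index of the closest discrete level.
    idx = np.argmin(np.abs(w_continuous[:, None] - levels[None, :]), axis=1)
    return levels[idx]

w = np.array([0.8, -0.3, 1.7, -1.2])
w_discrete = clip_weights(w)  # nearest of {-1, +1} for each entry
```

The continuous student is trained as usual; the discrete student is simply read off from it after each update, which is what allows the faster decay of the generalization error discussed in the excerpt.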
“…The study of online backpropagation as put forward by Biehl and Schwarze [5], and later developed in [6,7], has permitted an analytical understanding of several properties of the dynamics of the learning process. The most striking feature is the existence of learning plateaux, or symmetric phases, which signal learning stages where the information available to the student, and the form in which it is used, do not permit breaking the permutation symmetry among the hidden nodes.…”
Confidence: 99%
“…A common choice for the transfer function is g(x) = erf(x/√2). With this specific choice, the averaging in the equations of motion (30,31,32) can be performed analytically for general K and M [22,23,24]. Independent of the particular choice of learning algorithm, a general problem in two-layered networks is caused by the inherent permutation symmetry: the i-th input branch of the adaptive network (24) does not necessarily specialize on the i-th branch in the network (25). Without loss of generality, however, one can relabel the dynamical variables as if this were indeed the case.…”
Section: XI
Confidence: 99%
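The analytic averaging mentioned in this excerpt rests on the fact that, for g(x) = erf(x/√2), Gaussian averages of products of g close in terms of elementary functions; in particular, for zero-mean jointly Gaussian fields (x, y) with covariance matrix Q, ⟨g(x)g(y)⟩ = (2/π) arcsin(Q₁₂ / √((1+Q₁₁)(1+Q₂₂))). A quick Monte Carlo check of this identity (a sketch; the covariance values here are illustrative):

```python
import numpy as np
from math import erf, asin, pi, sqrt

# Covariance of the two Gaussian local fields (illustrative values).
Q = np.array([[1.0, 0.5],
              [0.5, 1.0]])

rng = np.random.default_rng(0)
fields = rng.multivariate_normal([0.0, 0.0], Q, size=1_000_000)

# Transfer function g(x) = erf(x / sqrt(2)).
g = np.vectorize(lambda u: erf(u / sqrt(2.0)))
mc = float(np.mean(g(fields[:, 0]) * g(fields[:, 1])))

# Closed-form Gaussian average for this transfer function.
analytic = (2.0 / pi) * asin(Q[0, 1] / sqrt((1.0 + Q[0, 0]) * (1.0 + Q[1, 1])))

print(f"Monte Carlo: {mc:.4f}, analytic: {analytic:.4f}")
```

It is this closure under Gaussian averaging that lets the order-parameter equations of motion be written in closed form for arbitrary numbers of hidden units K and M.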