2006
DOI: 10.1109/tnn.2005.863460

Convergence of Gradient Method With Momentum for Two-Layer Feedforward Neural Networks

Abstract: A gradient method with momentum for two-layer feedforward neural networks is considered. The learning rate is set to be a constant and the momentum factor an adaptive variable. Both the weak and strong convergence results are proved, as well as the convergence rates for the error function and for the weight. Compared to the existing convergence results, our results are more general since we do not require the error function to be quadratic.
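
The update scheme described in the abstract (constant learning rate, momentum factor recomputed at every step) can be illustrated with a minimal sketch. Several things here are assumptions, not details taken from the abstract: "two-layer" is read as an input layer feeding a single sigmoid output unit with no hidden layer; the adaptive rule sets the momentum factor to mu * ||gradient|| / ||previous step|| (and to 0 when the previous step is zero); the names `eta`, `mu`, `train`, and `OR_SAMPLES`, the squared-error function, and the logical-OR data are all hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def error_and_grad(w, samples):
    """Return E(w) = 1/2 * sum (g(w.x) - t)^2 and its gradient."""
    err = 0.0
    grad = [0.0] * len(w)
    for x, t in samples:
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        diff = y - t
        err += 0.5 * diff * diff
        coeff = diff * y * (1.0 - y)  # chain rule through the sigmoid
        for i, xi in enumerate(x):
            grad[i] += coeff * xi
    return err, grad


def train(samples, eta=0.5, mu=0.1, steps=2000):
    """Gradient descent with constant rate eta and an adaptive momentum factor.

    Assumed rule (illustrative, not taken from the abstract):
        tau_k = mu * ||grad_k|| / ||previous step||, or 0 if that step is zero.
    """
    w = [0.0] * len(samples[0][0])
    prev_step = [0.0] * len(w)
    errors = []
    for _ in range(steps):
        err, grad = error_and_grad(w, samples)
        errors.append(err)
        step_norm = norm(prev_step)
        tau = mu * norm(grad) / step_norm if step_norm > 0 else 0.0
        step = [-eta * g + tau * d for g, d in zip(grad, prev_step)]
        w = [wi + si for wi, si in zip(w, step)]
        prev_step = step
    return w, errors


# Logical OR; the last input component is a constant bias input (assumed data).
OR_SAMPLES = [
    ((0.0, 0.0, 1.0), 0.0),
    ((0.0, 1.0, 1.0), 1.0),
    ((1.0, 0.0, 1.0), 1.0),
    ((1.0, 1.0, 1.0), 1.0),
]

if __name__ == "__main__":
    weights, history = train(OR_SAMPLES)
    print("initial error:", history[0])
    print("final error:", history[-1])
```

Under this assumed rule the momentum term has norm mu * ||gradient||, so choosing mu smaller than eta keeps every step a descent direction. That is one simple way to see why the error can decrease without a quadratic error function, though the paper's actual conditions may differ.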

Citations: cited by 68 publications (33 citation statements)
References: 11 publications

Citation statements:
“…Notice that (24) represents a stochastic counterpart of (19).) By virtue of the well-known Doob's theorem [25], the property (24) yields …”
Section: Preliminaries (citation type: mentioning)
confidence: 99%
“…Unfortunately, this gives that the learning goes faster in the beginning and slows down in the late stage. The convergence analysis of learning algorithm with deterministic (non-stochastic) nature has been given in [17][18][19][20][21][22]. In contrast to the stochastic approach, several of these results allow to employ a constant learning rate [19,23].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Once the error gradient is derived, numerous optimization algorithms for minimizing can be applied to train PyraNet [20]- [22]. In this paper, we focus on five representative training algorithms, namely gradient descent (GD) [23], gradient descent with momentum and variable learning rate (GDMV) [24], resilient backpropagation (RPROP) [25], conjugate gradient (CG) [20], and Levenberg-Marquardt (LM) [26].…”
Section: B. PyraNet Training Algorithms (citation type: mentioning)
confidence: 99%
“…These techniques include such idea as varying the learning rate, using momentum and gain tuning of activation function. In [16] some convergence results are given where the learning fashion of training examples is batch learning. These results are of global nature in that they are valid for any arbitrarily given initial value of weights.…”
Section: not given (citation type: mentioning)
confidence: 99%