Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005.
DOI: 10.1109/ijcnn.2005.1555994

Second-order backpropagation algorithms for a stagewise-partitioned separable Hessian matrix

Abstract: Recent advances in computer technology allow the implementation of some important methods that were assigned lower priority in the past due to their computational burdens. Second-order backpropagation (BP) is such a method that computes the exact Hessian matrix of a given objective function. We describe two algorithms for feed-forward neural-network (NN) learning with emphasis on how to organize Hessian elements into a so-called stagewise-partitioned block-arrow matrix form: (1) stagewise BP, an extension of th…
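
As a reading aid for the abstract's "stagewise-partitioned block-arrow" phrase, here is a minimal sketch (our own, not taken from the paper) of the sparsity pattern such a Hessian takes for a single-hidden-layer MLP under a sum-of-squared-errors objective: the hidden-to-output parameters of each of the F output nodes form F independent diagonal blocks, while the shared input-to-hidden parameters form the arrow border that couples them. The function name and layer-size arguments are hypothetical.

```python
# Hypothetical sketch (not from the paper): the block-arrow sparsity pattern of the
# Hessian for a single-hidden-layer MLP, partitioned stagewise so that the
# hidden-to-output parameters of each of the F output nodes form F independent
# diagonal blocks, while the shared input-to-hidden parameters form the arrow
# border that couples all of them.
import numpy as np

def block_arrow_mask(n_in, n_hid, n_out):
    per_out = n_hid + 1                 # weights + threshold per output node
    n_stage2 = n_out * per_out          # second-stage (hidden-to-output) parameters
    n_stage1 = n_hid * (n_in + 1)       # shared first-stage (input-to-hidden) parameters
    n = n_stage2 + n_stage1
    mask = np.zeros((n, n), dtype=bool)
    for f in range(n_out):              # one diagonal block per terminal output
        s = slice(f * per_out, (f + 1) * per_out)
        mask[s, s] = True
    mask[n_stage2:, :] = True           # arrow border: shared parameters couple everything
    mask[:, n_stage2:] = True
    return mask

# The 1-2-3 example quoted in the citation statements below has 13 parameters in total:
print(block_arrow_mask(1, 2, 3).astype(int))   # 13x13 block-arrow pattern
```

With the shared block placed last, the arrowhead of this pattern "points downward to the right," matching the orientation mentioned in the citation statements below.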

Cited by 11 publications (16 citation statements). References 12 publications.
“…Needless to say, the latter approach has no chance to exploit negative curvature. Fortunately, the local and global Hessian matrices can be evaluated efficiently by our recently developed second-order stagewise backpropagation at essentially the same cost as the Gauss-Newton Hessian part in CANFIS neuro-fuzzy modular-network learning [19] (as well as in MLP learning [9], [8]). …”
Section: Discussion (mentioning)
confidence: 99%
“…Then, exploiting negative curvature turns out to be effective for avoiding locking onto so-called singular points. Here, our second-order stagewise backpropagation procedure [19], [9], [8] is an indispensable element that makes it very practical to find a descent direction of negative curvature, for it evaluates the entire Hessian H very efficiently (at essentially the same cost as JᵀJ alone). …”
Section: Discussion (mentioning)
confidence: 99%
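
As one hedged illustration of why the entire Hessian H (and not only the Gauss-Newton part JᵀJ) matters for this argument, the sketch below, which is our own and assumes a dense symmetric H is available in memory, extracts a negative-curvature descent direction from the most negative eigenpair; the eigendecomposition route is an illustrative choice, not necessarily the procedure used in the cited work.

```python
# Illustrative only: given the exact (symmetric) Hessian H and the gradient g,
# extract a descent direction of negative curvature when one exists.  The
# Gauss-Newton part J^T J alone could never supply such a direction, since it
# is positive semidefinite.
import numpy as np

def negative_curvature_direction(H, g, tol=1e-10):
    eigvals, eigvecs = np.linalg.eigh(H)    # symmetric eigendecomposition
    lam, d = eigvals[0], eigvecs[:, 0]      # most negative eigenvalue and its eigenvector
    if lam >= -tol:
        return None                         # no usable negative curvature
    if g @ d > 0:                           # orient d so it is also a descent direction
        d = -d
    return d                                # d satisfies d'Hd < 0 and g'd <= 0
```
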
“…In the best-known BP formulation due to Rumelhart et al. [3], xₛ, the vector of "before-node" net inputs [see Equation (6)], is treated as the state vector, whereas in optimal control, yₛ, the vector of "after-node" outputs, is chosen as the state vector. [Figure caption, partially recovered: an example network with] two hidden nodes (P₂ = 2) and three terminal outputs (F ≡ P₃ = 3), hence 13 parameters in total including threshold parameters: (a) the desired block-arrow Hessian matrix, whose arrowhead should point downward to the right (see [4], [5]), with F (= 3) diagonal blocks; (b) a Hessian matrix with a complex sparse pattern, which is hard to exploit, obtained by NETLAB (MATLAB-based software; see mlphess.m at http://www.ncrg.aston.ac.uk/netlab/). For large-scale optimization, it is not advisable to approximate the inverse of the Hessian because it always becomes dense. …”
Section: Introduction (mentioning)
confidence: 99%
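
The closing remark about not approximating the dense inverse suggests one way the block-arrow structure can be exploited. The following is a sketch under our own assumptions (block sizes, names, and nonsingular diagonal blocks are ours, not the paper's): a Newton-type system with a block-arrow coefficient matrix can be solved by F small per-block eliminations plus one Schur-complement solve in the shared border variables, so the dense inverse never needs to be formed.

```python
# Hypothetical sketch (block sizes and names are ours): solve the Newton-type
# system  [[D, B^T], [B, C]] [p_d; p_b] = -[g_d; g_b]  for a block-arrow Hessian
# with D = blockdiag(D_1, ..., D_F), without ever forming the dense inverse.
# Assumes every diagonal block D_f is nonsingular.
import numpy as np

def solve_block_arrow(D_blocks, B, C, g_d, g_b):
    sizes = [Df.shape[0] for Df in D_blocks]
    offs = np.cumsum([0] + sizes)
    # Apply D^{-1} blockwise to B^T and to g_d (F small solves instead of one big one).
    Dinv_Bt = np.vstack([np.linalg.solve(Df, B[:, offs[f]:offs[f + 1]].T)
                         for f, Df in enumerate(D_blocks)])
    Dinv_gd = np.concatenate([np.linalg.solve(Df, g_d[offs[f]:offs[f + 1]])
                              for f, Df in enumerate(D_blocks)])
    S = C - B @ Dinv_Bt                     # small Schur complement in the border variables
    p_b = np.linalg.solve(S, B @ Dinv_gd - g_b)
    p_d = -(Dinv_gd + Dinv_Bt @ p_b)
    return p_d, p_b
```

The cost is then dominated by the F small block factorizations and the border solve, rather than by one dense factorization of the full parameter-space Hessian.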