IJCNN'01. International Joint Conference on Neural Networks. Proceedings (Cat. No.01CH37222)
DOI: 10.1109/ijcnn.2001.939044

On complexity analysis of supervised MLP-learning for algorithmic comparisons

Cited by 41 publications (24 citation statements)
References 6 publications
“…where e^x is the transcendental exponential function [Mizutani and Dreyfus 2001]. As can be seen from examination of Eq.…”
Section: Neural-network Predictors
confidence: 86%
“…Furthermore, for some arbitrary real-valued, scalar variable x, we denote the number of flops required to execute the transcendental exponential function e^x by T_e and that required to execute the square root function x^{1/2} by T_s. Evaluating a transcendental exponential function e^x can take considerably more processing cycles than a standard arithmetic operation and can account for a significant proportion of the total estimated execution time [Mizutani and Dreyfus 2001]. Fast approximations to the exponential function, such as that proposed by Cawley [2000], can thus be employed to reduce the execution time, if required.…”
Section: Theoretical Computational Analysis
confidence: 99%
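The cost gap flagged in this passage (T_e flops for e^x versus one flop for ordinary arithmetic) is what fast exponential approximations target. Below is a minimal Python sketch of a Schraudolph-style bit-manipulation approximation, the family of methods that Cawley [2000] refines; the constants and the few-percent accuracy trade-off are stated as assumptions, not details taken from the cited papers.

```python
import numpy as np

def fast_exp(x):
    """Sketch of a Schraudolph-style fast exponential (the family Cawley [2000]
    refines): approximate e^x with one multiply-add by writing a scaled, shifted
    copy of x into the upper 32 bits of an IEEE-754 double.  The constants below
    are the commonly quoted ones; treat them as illustrative assumptions."""
    a = 2.0 ** 20 / np.log(2.0)          # scale so x lands in the exponent field
    b = 1023 * 2 ** 20 - 60801           # exponent bias, shifted to centre the error
    i = np.int64(int(a * x + b)) << 32   # place the integer in the upper half-word
    return float(i.view(np.float64))     # reinterpret the bit pattern as a double

# Rough check against the exact exponential (expect a few percent relative error).
print(fast_exp(1.0), np.exp(1.0))
```

Trading accuracy for speed this way only pays off when, as the quoted analysis notes, the exponential dominates the estimated execution time.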
“…5(a)] trained with a steepest descent-type online pattern-by-pattern mode learning (or an incremental gradient) algorithm in conjunction with backpropagation (BP) [23], [24], wherein the momentum term is set to 0.8, with one learning rate (or step size) for the parameters between the output and hidden layers and another for those between the hidden and input layers.…”
Section: Multiple-illuminant Experiments
confidence: 99%
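A rough sketch of the quoted training configuration, i.e. incremental (pattern-by-pattern) steepest descent with a momentum term and separate step sizes for the output-side and hidden-side weights, is given below. The one-hidden-layer architecture, tanh/linear activations, and the particular learning-rate values are illustrative assumptions, not the settings of the cited experiment.

```python
import numpy as np

def train_online(X, T, n_hidden, epochs=50, eta_out=0.05, eta_hid=0.01, mu=0.8, seed=0):
    """Minimal sketch of pattern-by-pattern (incremental) gradient descent with
    momentum for a one-hidden-layer MLP; sizes and rates are illustrative."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_hidden, n_in + 1))   # hidden weights (incl. bias)
    W2 = rng.normal(scale=0.1, size=(n_out, n_hidden + 1))  # output weights (incl. bias)
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    for _ in range(epochs):
        for x, t in zip(X, T):                       # one pattern at a time
            x1 = np.append(x, 1.0)                   # input with bias term
            h = np.tanh(W1 @ x1)                     # hidden activations
            h1 = np.append(h, 1.0)
            y = W2 @ h1                              # linear output layer
            e = y - t                                # output error
            g2 = np.outer(e, h1)                     # gradient for output weights
            delta_h = (W2[:, :-1].T @ e) * (1.0 - h ** 2)   # backpropagated hidden delta
            g1 = np.outer(delta_h, x1)               # gradient for hidden weights
            dW2 = mu * dW2 - eta_out * g2            # momentum update, output-layer rate
            dW1 = mu * dW1 - eta_hid * g1            # momentum update, hidden-layer rate
            W2 += dW2
            W1 += dW1
    return W1, W2
```

Keeping two step sizes (eta_out, eta_hid) mirrors the per-layer learning rates mentioned in the quote; the momentum update is applied after every pattern rather than once per epoch.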
“…(25) and (26). Yet, the stagewise computation by first-order BP can be viewed in such a way that the gradients are efficiently computed (without forming such sparse block-diagonal matrices explicitly) by the outer product δ^{s+1}y^T, which produces a P_{s+1}-by-(1 + P_s) matrix G^{s,s+1} of gradients [7] associated with the same-sized matrix Θ^{s,s+1} of parameters; here, column i of G^{s,s+1} is given as a P_{s+1}-vector g_i^{s,s+1} for θ_i^{s,s+1}. Again, the resulting gradient matrix G^{s,s+1} can be reshaped to an n_s-length gradient vector g^{s,s+1} in the same manner as shown in Eq.…”
Section: Stagewise First-order Backpropagation
confidence: 99%
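A toy version of the outer-product step described in this passage, with assumed stage sizes (P_s = 4, P_{s+1} = 3); the bias-first augmentation and the column-major reshape are illustrative conventions, not taken from the cited derivation.

```python
import numpy as np

# Assumed stage sizes: P_s outputs at stage s, P_{s+1} deltas at stage s+1.
P_s, P_s1 = 4, 3
rng = np.random.default_rng(0)
y_s = rng.normal(size=P_s)              # outputs of stage s
delta_s1 = rng.normal(size=P_s1)        # deltas backpropagated to stage s+1
y_aug = np.concatenate(([1.0], y_s))    # (1 + P_s)-vector; bias term first (a convention)

# One outer product yields the P_{s+1}-by-(1 + P_s) gradient matrix G^{s,s+1},
# the same shape as the parameter matrix it corresponds to.
G = np.outer(delta_s1, y_aug)
assert G.shape == (P_s1, 1 + P_s)

# Reshape column by column into an n_s-length gradient vector (n_s = P_{s+1} * (1 + P_s)).
g = G.reshape(-1, order="F")
print(g.shape)                          # (15,)
```

The reshape order is only bookkeeping; the point of the quoted argument is that the whole stage gradient falls out of a single outer product, with no block-diagonal matrix ever formed.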
“…No decisions are to be made at terminal stage N (or layer N); hence, there are N-1 decision stages in total. To compute the gradient vector for optimization purposes, we employ the "first-order" backpropagation (BP) process [5], [6], [7], which consists of two major procedures: a forward pass and a backward pass [see later Eq. (2)].…”
confidence: 99%
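The two-procedure structure described here, a forward pass through the N-1 decision stages followed by a backward pass that propagates deltas and assembles the stagewise gradients, might look roughly like the sketch below; the tanh hidden stages, linear terminal stage, and squared-error delta are assumed for illustration.

```python
import numpy as np

def first_order_bp(weights, x, t):
    """Sketch of the two-pass structure: a forward pass through the decision
    stages, then a backward pass that propagates deltas and forms the stagewise
    outer-product gradients.  Activations and the loss are illustrative."""
    # Forward pass: store each stage's (bias-augmented) output for the backward pass.
    ys = [np.append(x, 1.0)]
    for s, W in enumerate(weights):
        a = W @ ys[-1]
        if s < len(weights) - 1:
            ys.append(np.append(np.tanh(a), 1.0))   # hidden stage, augmented with bias
        else:
            ys.append(a)                            # linear terminal stage N
    # Backward pass: propagate deltas stage by stage, terminal stage first.
    grads = [None] * len(weights)
    delta = ys[-1] - t                              # squared-error delta at the terminal stage
    for s in range(len(weights) - 1, -1, -1):
        grads[s] = np.outer(delta, ys[s])           # same shape as weights[s]
        if s > 0:
            h = ys[s][:-1]                          # drop the bias entry
            delta = (weights[s][:, :-1].T @ delta) * (1.0 - h ** 2)
    return grads
```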