Dynamic adaptation of the error surface for the acceleration of the training of neural networks

Thome, A.G.; Tenorio, M.F.

doi:10.1109/icnn.1994.374204

Cited by 3 publications

(5 citation statements)

References 4 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Figure 2 shows the average number of neurons which undergo the opposite transformation per epoch for the cancer dataset. As described above, the behavior is essentially governed by Equation (14). As such, a similar behavior was seen for all datasets.…”

Section: A Function Call Overhead Analysissupporting

confidence: 64%

“…V( i) is some userdefined distribution or probability decision rule and U (0, 1) is a uniformly distributed real number, and (14) and represents the neurons which have the last two successive outputs either above or below <p (0).…”

Section: * E [Lcur I Cur]mentioning

confidence: 99%

“…(1) where X represents the input dataset and Z is the set of all weights and biases of N. The error function used by output neurons is e(·). been investigated, for example see [14], [5]. Similarly, many alternative transfer functions have been proposed, refer to [2], [3] for a survey.…”

Section: A Intuitionmentioning

confidence: 99%

See 2 more Smart Citations

Improving gradient-based learning algorithms for large scale feedforward networks

Ventresca

Tizhoosh

2009

2009 International Joint Conference on Neural Networks

View full text Add to dashboard Cite

Large scale neural networks have many hundreds or thousands of parameters (weights and biases) to learn, and as a result tend to have very long training times. Small scale networks can be trained quickly by using second-order information, but these fail for large architectures due to high computational cost. Other approaches employ local search strategies, which also add to the computational cost. In this paper we present a simple method, based on opposite transfer functions which greatly improve the convergence rate and accuracy of gradientbased learning algorithms. We use two variants of the backpropagation algorithm and common benchmark data to highlight the improvements. We find statistically significant improvements in both converegence speed and accuracy.

show abstract

Section: A Function Call Overhead Analysissupporting

confidence: 64%

Section: * E [Lcur I Cur]mentioning

confidence: 99%

See 1 more Smart Citation

Improving gradient-based learning algorithms for large scale feedforward networks

Ventresca

Tizhoosh

2009

2009 International Joint Conference on Neural Networks

View full text Add to dashboard Cite

show abstract

“…Dynamically adjusting transfer function parameters, and thus modifying the error surface and input-output mapping has been investigated, for example see [11], [12]. Similarly, many alternative transfer functions have been proposed, refer to [8] for a survey.…”

Section: Opposite Networkmentioning

confidence: 99%

“…Investigating the impact of transfer functions on the error surface has also been researched recently [8], [9], [10]. Also, adaptive transfer functions have been considered as a possible means to help alleviate ill-conditioning and improve accuracy and training times [11], [12], [13], [14].…”

Section: Introductionmentioning

confidence: 99%

Numerical condition of feedforward networks with opposite transfer functions

Ventresca

Tizhoosh

2008

2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)

View full text Add to dashboard Cite

Abstract-Numerical condition affects the learning speed and accuracy of most artificial neural network learning algorithms. In this paper, we examine the influence of opposite transfer functions on the conditioning of feedforward neural network architectures. The goal is not to discuss a new training algorithm nor error surface geometry, but rather to present characteristics of opposite transfer functions which can be useful for improving existing or to develop new algorithms. Our investigation examines two situations: (1) network initialization, and (2) early stages of the learning process. We provide theoretical motivation for the consideration of opposite transfer functions as a means to improve conditioning during these situations. These theoretical results are validated by experiments on a subset of common benchmark problems. Our results also reveal the potential for opposite transfer functions in other areas of, and related to neural networks.

show abstract

SHAKE-A multi-criterion optimization scheme for neural network training

Pacheco

Thome²

Proceedings of International Conference on Neural Networks (ICNN'96)

View full text Add to dashboard Cite

This paper presents a new approach for the task of feedforward type neural network training process based on a multi-criterion efficiency measurement. Here we propose a novel hybrid neuro-genetic algorithm that tries to optimize a three dimension criterion vector composed by speed, accuracy, and percentage of convergence, which measures the overall stability of the training algorithm to converge to good minimal. The proposed approach takes the speed advantage of the conventional algorithms as well as the accuracy and percentage of convergence advantages of the genetic algorithms. The empirical results obtained up to now shows the strength and potentiality of the method.

show abstract

Dynamic adaptation of the error surface for the acceleration of the training of neural networks

Cited by 3 publications

References 4 publications

Improving gradient-based learning algorithms for large scale feedforward networks

Improving gradient-based learning algorithms for large scale feedforward networks

Numerical condition of feedforward networks with opposite transfer functions

SHAKE-A multi-criterion optimization scheme for neural network training

Contact Info

Product

Resources

About