1997
DOI: 10.1007/978-1-4615-6099-9_41
Neural Network Supervised Training Based on a Dimension Reducing Method

Cited by 11 publications (13 citation statements)
References 5 publications
“…Le Cun's technique could be used to determine an initial η at an additional cost in the number of presentations of the training set during early training. Another approach proposed in the literature is to consider a learning rate that is proportional to the inverse of the Lipschitz constant, which, unfortunately, is not easily available [Armijo, 1966; Magoulas et al., 1997a, 1997b]. So despite these efforts, obtaining convergence of BP training algorithms utilizing a constant learning rate is still considered very difficult [Kuan & Hornik, 1991; Liu et al., 1995], and practitioners usually choose 0 < η < 1 to ensure that successive weight updates will not lead to missing a minimum of the error surface.…”
Section: Formulation of the Supervised Training Problem (mentioning)
confidence: 99%
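The two schemes contrasted in this excerpt, a fixed learning rate 0 < η < 1 and a rate tied to the inverse of the (rarely available) Lipschitz constant, can be sketched as follows. This is a minimal illustration, not code from the cited paper; the function names, the 1/(2L) rule, and the default values are assumptions chosen for demonstration.

```python
import numpy as np

def bp_constant_lr(grad_E, w0, eta=0.1, max_iters=1000, tol=1e-6):
    """Batch gradient descent with a fixed learning rate 0 < eta < 1.

    grad_E: callable returning the gradient of the error E at the weights w.
    Too large an eta can overshoot a minimum of the error surface;
    too small an eta slows training down.
    """
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iters):
        g = grad_E(w)
        if np.linalg.norm(g) < tol:   # stop once the gradient is small
            break
        w = w - eta * g               # constant-step weight update
    return w

def lipschitz_lr(grad_E, w, w_prev):
    """Learning rate proportional to the inverse of a local Lipschitz estimate.

    L ~ ||grad_E(w) - grad_E(w_prev)|| / ||w - w_prev||.  The true Lipschitz
    constant is not easily available, so this is only a local, data-dependent
    estimate; the 1/(2L) scaling is one common choice, not the paper's rule.
    """
    L = np.linalg.norm(grad_E(w) - grad_E(w_prev)) / np.linalg.norm(w - w_prev)
    return 1.0 / (2.0 * L)
```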
“…Next we examine approaches to dynamically adapt the rate of learning that are based on optimization methods [Magoulas et al., 1997a, 1997b; Vrahatis et al., 2000a; Anastasiadis et al., 2005a; Anastasiadis et al., 2005b]. In the context of unconstrained optimisation, Armijo's modified SD algorithm automatically adapts the rate of convergence [Armijo, 1966].…”
Section: Adaptive Learning Rate Algorithms in an Optimization Context (mentioning)
confidence: 99%
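Armijo's rule, mentioned in this excerpt, adapts the steepest-descent step by backtracking until a sufficient-decrease condition holds. The sketch below is a generic textbook version under assumed defaults (beta, sigma, eta0), not the specific variant used in the cited papers.

```python
import numpy as np

def armijo_step(E, grad_E, w, eta0=1.0, beta=0.5, sigma=1e-4, max_halvings=30):
    """One steepest-descent step with Armijo's backtracking line search.

    The trial step eta0 is shrunk by the factor beta until
        E(w - eta*g) <= E(w) - sigma * eta * ||g||^2
    holds, so the rate of convergence adapts automatically instead of
    being fixed in advance.
    """
    g = grad_E(w)
    eta = eta0
    for _ in range(max_halvings):
        w_new = w - eta * g
        if E(w_new) <= E(w) - sigma * eta * np.dot(g, g):
            return w_new, eta
        eta *= beta
    return w - eta * g, eta   # fall back to the smallest trial step
```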
“…This equation formulates the error function to be minimized, in which $t_{j,p}$ specifies the desired response at the $j$-th neuron of the output layer for the input pattern $p$, and $y^{L}_{j,p}$ is the output at the $j$-th neuron of the output layer $L$, which depends on the weights of the network and on a nonlinear activation function, such as the well-known sigmoid. Attempts to speed up back-propagation training have been made by dynamically adapting the stepsize during training [9, 20], or by using second-derivative-related information [11, 13, 19]. However, these BP-like training algorithms occasionally converge to local minima which affect the efficiency of the learning process.…”
Section: Introduction (mentioning)
confidence: 99%
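The equation referred to in this excerpt is not reproduced in the snippet. Assuming the standard batch sum-of-squared-errors formulation that the surrounding definitions describe, it would take a form such as the following; the 1/2 factor and exact index ranges are assumptions of the conventional presentation, not taken from the cited paper.

```latex
E(w) = \frac{1}{2} \sum_{p=1}^{P} \sum_{j=1}^{N_L}
       \left( y^{L}_{j,p} - t_{j,p} \right)^{2},
\qquad
y^{L}_{j,p} = \sigma\!\left( \sum_{i} w^{L}_{ji}\, y^{L-1}_{i,p} \right),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}} .
```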