1993
DOI: 10.1209/0295-5075/21/8/013

Optimal Learning with a Neural Network

Abstract: We introduce optimal learning with a neural network, which we define as minimising the expected generalisation error. We find that the optimally trained spherical perceptron may learn a linearly separable rule as well as any possible network. We sketch an algorithm to generate optimal learning, and simulation results support our conclusions. Optimal learning of a well-known, significant unlearnable problem, the …
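The "optimal learning" of the abstract is Bayes-optimal generalisation of a linearly separable rule, which the citing papers below describe via the version space and its centre of mass (Watkin's 'samplers' of the version space). The following is a minimal illustrative sketch of that idea only; the toy dimensions, rejection-sampling scheme, and all variable names are assumptions for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy teacher-student setup (dimensions kept small so naive rejection sampling works).
N, P = 6, 12                          # input dimension, number of training examples
teacher = rng.standard_normal(N)      # the linearly separable rule to be learned
X = rng.standard_normal((P, N))       # training inputs
y = np.sign(X @ teacher)              # training labels given by the teacher

def sample_version_space(n_samples=500, batch=50_000):
    """Rejection-sample unit weight vectors that classify every training example correctly."""
    kept = []
    while len(kept) < n_samples:
        W = rng.standard_normal((batch, N))
        consistent = np.all(np.sign(W @ X.T) == y, axis=1)
        hits = W[consistent]
        if hits.size:
            kept.extend(hits / np.linalg.norm(hits, axis=1, keepdims=True))
    return np.array(kept[:n_samples])

# Approximate the Bayes-optimal student as the centre of mass of the version space.
samples = sample_version_space()
w_bayes = samples.mean(axis=0)
w_bayes /= np.linalg.norm(w_bayes)

def gen_error(w):
    """Generalisation error vs. the teacher on Gaussian inputs: eps = angle(w, teacher) / pi."""
    overlap = w @ teacher / np.linalg.norm(teacher)
    return np.arccos(np.clip(overlap, -1.0, 1.0)) / np.pi

print("single consistent perceptron:", gen_error(samples[0]))
print("centre-of-mass (Bayes) estimate:", gen_error(w_bayes))
```

In this toy run the averaged vector typically generalises better than any single sampled perceptron, which is the effect the abstract's "optimal learning" refers to.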

Cited by 47 publications (29 citation statements). References 7 publications.
“…With this cost function, the optimal generalizer may be found by a simple gradient descent, with neither the need to train an infinite number of perceptrons for implementing a committee machine, as was suggested by Opper and Haussler [8], nor to determine a large number of 'samplers' of the version space, as proposed by Watkin [10]. Once the potential is known, it is straightforward to calculate the distribution of stabilities of the training set:…”
Section: Theoretical Results
confidence: 99%
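The procedure described in this snippet is gradient descent on a cost that is a potential of the pattern stabilities, followed by reading off the distribution of those stabilities. The sketch below illustrates that procedure with a generic smooth placeholder potential; the specific optimal potential derived in the citing paper is not reproduced in the snippet, so V, beta, the learning rate, and the dimensions here are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 50, 100                        # toy input dimension and training-set size
teacher = rng.standard_normal(N)
X = rng.standard_normal((P, N))
y = np.sign(X @ teacher)

def stabilities(w):
    """Stabilities gamma_mu = y_mu (w . x_mu) / |w| of the training patterns."""
    return y * (X @ w) / np.linalg.norm(w)

# Placeholder potential: smooth and decreasing in the stability (softplus of -beta*gamma).
beta = 2.0
def V(g):
    return np.log1p(np.exp(-beta * g)) / beta
def dV(g):
    return -1.0 / (1.0 + np.exp(beta * g))        # V'(gamma)

def grad_cost(w):
    """Gradient of E(w) = sum_mu V(gamma_mu) with respect to w."""
    nw = np.linalg.norm(w)
    g = stabilities(w)
    # d gamma_mu / d w = y_mu x_mu / |w| - gamma_mu w / |w|^2
    dgamma = (y[:, None] * X) / nw - np.outer(g, w) / nw**2
    return (dV(g)[:, None] * dgamma).sum(axis=0)

# Plain gradient descent on the cost, keeping w on the sphere |w| = sqrt(N).
w = rng.standard_normal(N)
w *= np.sqrt(N) / np.linalg.norm(w)
print("cost before:", V(stabilities(w)).sum())
for _ in range(500):
    w = w - 0.02 * grad_cost(w)
    w *= np.sqrt(N) / np.linalg.norm(w)
print("cost after:", V(stabilities(w)).sum())

# Distribution of stabilities of the training set after training (cf. the quoted remark).
counts, edges = np.histogram(stabilities(w), bins=20)
for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"gamma in [{lo:5.2f}, {hi:5.2f}): {c}")
```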
“…The fact that the Bayesian student has patterns at vanishing distance from the hyperplane, and has most patterns at distances larger than κ, allows us to conclude that its weight vector lies close to the boundary of the version space. It has been shown [10] that the Bayesian weight vector is the barycenter of the (strictly convex) version space. Our result means that the barycenter of the version space is far from its center, which is rather surprising, and might indicate that the version space is highly non-spherical.…”
Section: Theoretical Results
confidence: 99%
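For reference, the two geometric objects in this remark can be stated compactly; the notation below is assumed for illustration rather than taken from the snippet. With unit-length students $\mathbf{J}$ and training examples $(\boldsymbol{\xi}_\mu,\sigma_\mu)$, the version space and its barycentre (the Bayesian weight vector referred to above) are

$$
V=\bigl\{\mathbf{J}:\ \|\mathbf{J}\|=1,\ \ \sigma_\mu\,\mathbf{J}\cdot\boldsymbol{\xi}_\mu>0\ \ \forall\,\mu\bigr\},
\qquad
\mathbf{J}_{\mathrm{B}}=\frac{\int_{V}\mathbf{J}\,\mathrm{d}\mathbf{J}}{\bigl\|\int_{V}\mathbf{J}\,\mathrm{d}\mathbf{J}\bigr\|},
$$

and the distance of pattern $\mu$ from the hyperplane of a unit-length student $\mathbf{J}$ is $|\mathbf{J}\cdot\boldsymbol{\xi}_\mu|$, which is the quantity the snippet compares with the margin $\kappa$.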
“…ANN application areas include data classification and pattern recognition (Ripley, 1996), damage detection and earthquake simulation (Pei et al., 2006), function approximation (Toh, 1999; Ye and Lin, 2003), material science (Bhadeshia, 1999), experimental design of engineering systems (Röpke et al., 2005), nonlinear optimization (Malek et al., 2010), polypeptide structure prediction (Dorn and de Souza, 2010), prediction of trading signals of stock market indices (Tilakaratne et al., 2008), regression analysis (De Veux et al., 1998), signal and image processing (Watkin, 1993; Masters, 1994), and time series analysis and forecasting (Franses and van Dijk, 2000; Kajitani et al., 2005).…”
Section: Discussion
confidence: 99%
“…ANN model parameterization frameworks and numerical studies are presented and discussed, e.g., by Watkin (1993), Prechelt (1994), Bianchini and Gori (1996), Sexton et al. (1998), Jordanov and Brown (1999), Toh (1999), Ye and Lin (2003), Abraham (2004), and Hamm et al. (2007).…”
Section: Postulating and Calibrating a Model Instance
confidence: 99%