2021
DOI: 10.1038/s41467-021-26568-2

Correspondence between neuroevolution and gradient descent

Abstract: We show analytically that training a neural network by conditioned stochastic mutation or neuroevolution of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks.
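The correspondence stated in the abstract can be illustrated with a minimal sketch (not the authors' code): at a fixed parameter vector, propose Gaussian mutations of all weights at once and keep only those that do not increase the loss; averaged over many independent realizations, the accepted displacement points along the negative gradient, i.e. the mutation rule behaves on average like gradient descent with Gaussian white noise. The quadratic loss and all numerical values below are illustrative assumptions.

```python
import numpy as np

def loss(w):
    return 0.5 * np.sum(w ** 2)      # toy stand-in for a network loss

def grad(w):
    return w                          # gradient of the toy loss

rng = np.random.default_rng(0)
w = np.array([1.0, -2.0, 0.5])
sigma = 1e-3                          # small-mutation regime

# Average the conditioned mutation over many independent realizations.
moves = []
for _ in range(200_000):
    eps = rng.normal(0.0, sigma, size=w.shape)   # mutate all weights at once
    # Conditioned mutation: keep the proposal only if the loss does not rise.
    moves.append(eps if loss(w + eps) <= loss(w) else np.zeros_like(w))
mean_move = np.mean(moves, axis=0)

g = grad(w)
print(mean_move / np.linalg.norm(mean_move))     # direction of the average move
print(-g / np.linalg.norm(g))                    # direction of -grad: nearly equal
```

In this small-sigma limit the two printed directions agree, which is the averaged statement of the abstract; individual realizations scatter around that mean, playing the role of the Gaussian white noise.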

Cited by 15 publications (14 citation statements) · References 34 publications.
“…These results, and the correspondence described in Ref. [17], establish both theoretically and empirically the ability of Metropolis MC to train neural nets.…”
Section: Monte Carlo (supporting)
confidence: 79%
“…This class of algorithm involves taking a neural network of fixed structure, adding random numbers to all weights and biases simultaneously, and accepting the change if the loss function does not increase. For uncorrelated Gaussian random numbers this procedure is equivalent, for small updates, to clipped gradient descent in the presence of Gaussian white noise [15][16][17][18], and so its ability to train a neural network should be similar to that of simple gradient descent (GD). We show in Section II A that, for a particular supervised-learning problem, this is the case, a finding consistent with results presented by other authors [6][7][8].…”
Section: Introduction (mentioning)
confidence: 99%
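As an illustration of the procedure quoted above, the following sketch (an assumption-laden toy example, not code from the cited papers) applies the rule to a small fixed-structure network: add Gaussian random numbers to all weights and biases simultaneously and accept the change only if the loss does not increase. Network size, data, and the mutation scale are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                  # toy regression inputs
y = np.sin(X.sum(axis=1, keepdims=True))      # toy targets

def init_params():
    return {"W1": rng.normal(scale=0.5, size=(3, 16)), "b1": np.zeros(16),
            "W2": rng.normal(scale=0.5, size=(16, 1)), "b2": np.zeros(1)}

def forward(p, X):
    h = np.tanh(X @ p["W1"] + p["b1"])        # one hidden layer, tanh
    return h @ p["W2"] + p["b2"]

def loss(p):
    return np.mean((forward(p, X) - y) ** 2)  # mean-squared error

def mutate(p, sigma):
    # Perturb every weight and bias at once with i.i.d. Gaussian noise.
    return {k: v + rng.normal(scale=sigma, size=v.shape) for k, v in p.items()}

params = init_params()
for _ in range(20000):
    trial = mutate(params, sigma=0.02)
    if loss(trial) <= loss(params):           # accept only non-increasing loss
        params = trial
print(round(loss(params), 4))                 # loss falls, much as under gradient descent
```

For small mutation scales this accept-if-not-worse rule reduces the loss in a manner comparable to gradient descent, consistent with the equivalence the quoted statement describes.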
“…In [65] the authors show that neuroevolution performs equivalently to gradient descent on the loss function in the presence of Gaussian white noise. In this study, numerical simulations were performed to illustrate the correspondence between the two methods, which can be observed when they are applied to shallow and deep neural networks.…”
Section: Related Work (mentioning)
confidence: 99%
“…One way to optimize the network's architecture is through neuroevolution [14]. Neuroevolution has been shown to perform similarly to gradient descent on the loss function in the presence of Gaussian white noise [15]. Obviously, this view is not shared by everyone.…”
Section: Introduction (mentioning)
confidence: 99%