2021
DOI: 10.1038/s41467-021-26568-2

Correspondence between neuroevolution and gradient descent

Abstract: We show analytically that training a neural network by conditioned stochastic mutation or neuroevolution of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks.
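The correspondence stated in the abstract can be illustrated with a minimal sketch (not the authors' code): at a fixed parameter vector, propose Gaussian mutations of all weights at once and keep only those that do not increase the loss; averaged over many independent realizations, the accepted displacement points along the negative gradient, i.e. the mutation rule behaves on average like gradient descent with Gaussian white noise. The quadratic loss and all numerical values below are illustrative assumptions.

```python
import numpy as np

def loss(w):
    return 0.5 * np.sum(w ** 2)      # toy stand-in for a network loss

def grad(w):
    return w                          # gradient of the toy loss

rng = np.random.default_rng(0)
w = np.array([1.0, -2.0, 0.5])
sigma = 1e-3                          # small-mutation regime

# Average the conditioned mutation over many independent realizations.
moves = []
for _ in range(200_000):
    eps = rng.normal(0.0, sigma, size=w.shape)   # mutate all weights at once
    # Conditioned mutation: keep the proposal only if the loss does not rise.
    moves.append(eps if loss(w + eps) <= loss(w) else np.zeros_like(w))
mean_move = np.mean(moves, axis=0)

g = grad(w)
print(mean_move / np.linalg.norm(mean_move))     # direction of the average move
print(-g / np.linalg.norm(g))                    # direction of -grad: nearly equal
```

In this small-sigma limit the two printed directions agree, which is the averaged statement of the abstract; individual realizations scatter around that mean, playing the role of the Gaussian white noise.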

Cited by 15 publications (14 citation statements) · References 34 publications.
“…These results, and the correspondence described in Ref. [17], establish both theoretically and empirically the ability of Metropolis MC to train neural nets.…”
Section: Monte Carlo (supporting)
confidence: 79%
“…This class of algorithm involves taking a neural network of fixed structure, adding random numbers to all weights and biases simultaneously, and accepting the change if the loss function does not increase. For uncorrelated Gaussian random numbers this procedure is equivalent, for small updates, to clipped gradient descent in the presence of Gaussian white noise [15][16][17][18], and so its ability to train a neural network should be similar to that of simple gradient descent (GD). We show in Section II A that, for a particular supervised-learning problem, this is the case, a finding consistent with results presented by other authors [6][7][8].…”
Section: Introduction (mentioning)
confidence: 99%
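As an illustration of the procedure quoted above, the following sketch (an assumption-laden toy example, not code from the cited papers) applies the rule to a small fixed-structure network: add Gaussian random numbers to all weights and biases simultaneously and accept the change only if the loss does not increase. Network size, data, and the mutation scale are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                  # toy regression inputs
y = np.sin(X.sum(axis=1, keepdims=True))      # toy targets

def init_params():
    return {"W1": rng.normal(scale=0.5, size=(3, 16)), "b1": np.zeros(16),
            "W2": rng.normal(scale=0.5, size=(16, 1)), "b2": np.zeros(1)}

def forward(p, X):
    h = np.tanh(X @ p["W1"] + p["b1"])        # one hidden layer, tanh
    return h @ p["W2"] + p["b2"]

def loss(p):
    return np.mean((forward(p, X) - y) ** 2)  # mean-squared error

def mutate(p, sigma):
    # Perturb every weight and bias at once with i.i.d. Gaussian noise.
    return {k: v + rng.normal(scale=sigma, size=v.shape) for k, v in p.items()}

params = init_params()
for _ in range(20000):
    trial = mutate(params, sigma=0.02)
    if loss(trial) <= loss(params):           # accept only non-increasing loss
        params = trial
print(round(loss(params), 4))                 # loss falls, much as under gradient descent
```

For small mutation scales this accept-if-not-worse rule reduces the loss in a manner comparable to gradient descent, consistent with the equivalence the quoted statement describes.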
“…In [65] the authors show that neuroevolution performs equivalently to gradient descent on the loss function in the presence of Gaussian white noise. In this study, numerical simulations were performed to illustrate the correspondence between the two methods, which can be observed when they are applied to shallow and deep neural networks.…”
Section: Related Work (mentioning)
confidence: 99%
“…One way to optimize the network's architecture is through neuroevolution [14]. Neuroevolution has been shown to perform similarly to gradient descent on the loss function in the presence of Gaussian white noise [15]. Obviously, this view is not shared by everyone.…”
Section: Introduction (mentioning)
confidence: 99%