2022
DOI: 10.1016/j.neunet.2022.02.016
Universality of gradient descent neural network training

Cited by 7 publications (4 citation statements)
References 49 publications
“…Neurons are combined according to a certain layered structure, and a neural network is thus created. In practice, because an activation function is required to model neuron firing, the traditional sigmoid activation function is prone to vanishing gradients (Welper, 2022), which makes important gradients difficult to optimize.…”
Section: Algorithm Preprocessing
confidence: 99%
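
The vanishing-gradient effect this statement refers to can be seen numerically: the sigmoid derivative never exceeds 0.25, so the chain rule shrinks the backpropagated signal with every layer. The following minimal sketch (my illustration, not code from the cited or citing papers) tracks only the activation derivatives and ignores weight factors.

```python
# Minimal numerical sketch of vanishing gradients with sigmoid activations:
# the derivative is at most 0.25, so the backpropagated gradient shrinks
# geometrically with depth (weight factors are ignored for simplicity).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
grad = 1.0                                  # gradient of the loss at the output
for layer in range(20):                     # 20 stacked sigmoid layers (hypothetical depth)
    pre_activation = rng.normal()
    grad *= sigmoid_prime(pre_activation)   # chain rule through one layer
print(f"gradient magnitude after 20 sigmoid layers: {abs(grad):.2e}")
```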
“…For the intrinsic calculation of neural networks, back propagation is required to compute the gradients. However, because back propagation is memory-intensive and overfitting must be taken into account, reporting the error directly is avoided (Welper, 2022), and intermediate computations are released automatically once a gradient has been evaluated. The classification results for the coal structure can be obtained by inputting logging data through this training model.…”
Section: Neural Network Structure
confidence: 99%
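
One concrete reading of the memory behaviour described here is how deep-learning frameworks such as PyTorch free the intermediate buffers of the autograd graph as soon as backward() completes, unless retain_graph=True is requested. The sketch below is an assumed minimal example of that behaviour, not code from the citing paper.

```python
# Sketch: intermediate buffers of the autograd graph are released when
# backward() finishes, so the same graph cannot be backpropagated twice
# unless retain_graph=True is passed.
import torch

x = torch.randn(8, 4)
w = torch.randn(4, 1, requires_grad=True)
y = torch.randn(8, 1)

pred = x @ w                      # forward pass builds the autograd graph
loss = ((pred - y) ** 2).mean()   # squared-error loss
loss.backward()                   # gradients computed, intermediates freed here
print(w.grad.shape)

try:
    loss.backward()               # reusing the freed graph fails
except RuntimeError as err:
    print("second backward fails:", err)
```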
“…Generally, we use the gradient descent method to train the network [5]. However, when we use the squared-error-sum loss function to train the SPSNN, an inappropriate number of hidden-layer neurons and the magnitude of the weights may lead to underfitting or overfitting of the model, which affects its performance and generalization ability.…”
Section: Introduction
confidence: 99%
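
A minimal sketch of the training setup this statement refers to, plain gradient descent on a sum-of-squared-errors loss for a one-hidden-layer network, is given below. The data, architecture, and hyperparameters (n_hidden, lr) are hypothetical stand-ins, not the SPSNN of the citing paper; n_hidden is the quantity whose choice the authors link to under- or overfitting.

```python
# One-hidden-layer network trained by full-batch gradient descent on the
# sum of squared errors; backpropagation is written out explicitly.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X) + 0.1 * rng.normal(size=X.shape)

n_hidden = 10                                  # hypothetical hidden width
W1 = rng.normal(scale=0.5, size=(1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
lr = 0.005                                     # learning rate

for step in range(2000):
    h = np.tanh(X @ W1 + b1)                   # hidden layer
    pred = h @ W2                              # network output
    err = pred - y
    loss = 0.5 * np.sum(err ** 2)              # squared-error sum
    # backpropagation (chain rule) for the three parameter blocks
    gW2 = h.T @ err
    gh = (err @ W2.T) * (1 - h ** 2)
    gW1 = X.T @ gh
    gb1 = gh.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2

print(f"final training loss: {loss:.4f}")
```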
“…The ELM algorithm was developed to address these issues. Since the desired accuracy and stability of ELM cannot be met, and the weights and thresholds are randomly generated before training, it is typically optimized [8]. The benefits of swarm intelligence optimization algorithms include strong parallelism, independent exploration, ease of use, quick convergence, etc.…”
Section: Introduction
confidence: 99%
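
For context on this statement, a bare-bones Extreme Learning Machine keeps its randomly drawn hidden-layer weights and thresholds fixed and only fits the output weights, typically by a least-squares solve; the swarm-intelligence optimization the authors discuss would replace the plain random draw with a search over those random parameters. The sketch below is an assumed minimal illustration, not the optimized ELM from the citing paper.

```python
# Bare-bones ELM: random, untrained hidden-layer weights and thresholds;
# only the output weights are fitted, via least squares.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(X[:, :1]) + X[:, 1:] ** 2          # toy regression target

n_hidden = 50
W = rng.normal(size=(2, n_hidden))            # random input weights (untrained)
b = rng.normal(size=n_hidden)                 # random thresholds (untrained)

H = np.tanh(X @ W + b)                        # hidden-layer feature matrix
beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights by least squares

pred = H @ beta
print(f"training MSE: {np.mean((pred - y) ** 2):.4f}")
```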