2020
DOI: 10.1098/rspa.2020.0334
Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks

Abstract: We propose two approaches to locally adaptive activation functions, namely layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of the activation function is achieved by introducing a scalable parameter in each layer (layer-wise) or for every neuron (neuron-wise) separately, and then optimizing it using a variant of the stochastic gradient descent algorithm. In order to further increase the training speed, an…

Cited by 207 publications (92 citation statements)
References 15 publications
“…The last activation function is the adaptive swish function where a is an additional parameter and is optimized in the training process [14,15]. Overall, the DNN approximation of PDE solution can be written as…”
Section: Network Structure (mentioning)
confidence: 99%
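The adaptive swish mentioned in the excerpt above is straightforward to sketch: the activation is the standard swish applied to a rescaled pre-activation, with the slope parameter registered as a trainable parameter so the optimizer updates it alongside the network weights. The module below is a minimal illustration in PyTorch; the class name, the fixed scaling factor n, and the initial value of a are assumptions made for the example, not details of the cited implementations [14,15].

```python
import torch
import torch.nn as nn


class AdaptiveSwish(nn.Module):
    """Swish activation with a trainable slope parameter a.

    A minimal sketch: the effective activation is z * sigmoid(z) with
    z = n * a * x, where n is a fixed scaling factor and a is learned
    together with the weights (names and defaults are illustrative).
    """

    def __init__(self, n: float = 1.0, a_init: float = 1.0):
        super().__init__()
        self.n = n                                   # fixed, not trained
        self.a = nn.Parameter(torch.tensor(a_init))  # updated by the optimizer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.n * self.a * x
        return z * torch.sigmoid(z)                  # swish(z) = z * sigmoid(z)
```

Because a is an nn.Parameter, it already appears in model.parameters(), so no special optimizer setup is needed.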
“…Layer-wise introduction of the additional parameters a_k changes the slope of the activation function in each hidden layer, thereby increasing the training speed. Moreover, these activation slopes can also contribute to the loss function through the slope recovery term; see [21,22] for more details. Such locally adaptive activation functions enhance the learning capacity of the network, especially during the early training period.…”
Section: Mathematical Setup For Fully Connected Neural Network (mentioning)
confidence: 99%
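As a concrete illustration of the layer-wise variant and the slope recovery term described in this excerpt, the sketch below gives each hidden layer its own slope parameter a_k and adds to the loss the reciprocal of the mean of exp(a_k) over the hidden layers. The exact functional form of the slope recovery term in [21,22] may differ; treat this as an assumed, representative form rather than the authors' implementation, and the class and method names as hypothetical.

```python
import torch
import torch.nn as nn


class LayerwiseAdaptiveMLP(nn.Module):
    """Fully connected network with one trainable slope a_k per hidden layer."""

    def __init__(self, sizes, n: float = 1.0):
        super().__init__()
        self.n = n
        self.linears = nn.ModuleList(
            nn.Linear(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)
        )
        # one slope parameter per hidden layer, initialised so that n * a_k = 1
        self.a = nn.ParameterList(
            nn.Parameter(torch.tensor(1.0 / n)) for _ in range(len(sizes) - 2)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for linear, a_k in zip(self.linears[:-1], self.a):
            x = torch.tanh(self.n * a_k * linear(x))   # slope-scaled activation
        return self.linears[-1](x)                     # linear output layer

    def slope_recovery(self) -> torch.Tensor:
        """Slope recovery term: 1 / mean_k exp(a_k); it decreases as the
        slopes grow, so adding it to the loss pushes the slopes upward."""
        return 1.0 / torch.mean(torch.stack([torch.exp(a_k) for a_k in self.a]))
```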
“…The gradient dynamics of the adaptive activation modifies the standard dynamics (fixed activation) by multiplying a conditioning matrix by the gradient and by adding the approximate second-order term. In this paper, we used the scaling factor n = 5 for all hidden layers and initialize n·a_k = 1, ∀k; see [22] for details.…”
Section: Mathematical Setup For Fully Connected Neural Network (mentioning)
confidence: 99%
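The initialization convention quoted here (scaling factor n = 5 with n·a_k = 1, i.e. a_k = 1/n) means the scaled activation starts out identical to the standard fixed activation, and the slopes only drift away from that point as training proceeds. Below is a hedged usage sketch that reuses the hypothetical LayerwiseAdaptiveMLP from the earlier example with a toy regression target; the layer sizes, learning rate, and target function are assumptions for illustration only.

```python
import math
import torch

# LayerwiseAdaptiveMLP is the illustrative class sketched above.
model = LayerwiseAdaptiveMLP(sizes=[1, 20, 20, 1], n=5.0)   # a_k starts at 1/5
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # weights and slopes

x = torch.linspace(0.0, 1.0, 100).unsqueeze(1)
y_true = torch.sin(2.0 * math.pi * x)                       # toy target

for step in range(1000):
    optimizer.zero_grad()
    data_loss = torch.mean((model(x) - y_true) ** 2)
    loss = data_loss + model.slope_recovery()                # add slope recovery term
    loss.backward()
    optimizer.step()
```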
“…Other examples include networks with an adaptive polynomial activation function [33], a slope-varying activation function [34], or a back-propagation modification resulting in AAF [34,35]. In parallel with our work, a neuron-wise and layer-wise adaptive activation function for physics-informed neural networks was presented in [36].…”
Section: Introduction (mentioning)
confidence: 99%