2019
DOI: 10.3390/e21070627

Smooth Function Approximation by Deep Neural Networks with General Activation Functions

Abstract: There has been growing interest in the expressivity of deep neural networks. However, most of the existing work on this topic focuses only on specific activation functions such as the ReLU or sigmoid. In this paper, we investigate the approximation ability of deep neural networks with a broad class of activation functions. This class includes most of the frequently used activation functions. We derive the required depth, width and sparsity of a deep neural network to approximate any Hölder …
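
The abstract's central quantities (depth, width, sparsity |θ|_0, and a general activation function) can be made concrete with a small sketch. The PyTorch code below, with illustrative names and sizes that are not taken from the paper, builds a plain fully connected network with a pluggable activation and reports its sparsity.

```python
# Minimal sketch (assumed names/sizes): a deep fully connected network with a
# pluggable activation, plus a count of nonzero parameters |theta|_0.
import torch
import torch.nn as nn


def make_mlp(in_dim, width, depth, activation=nn.ReLU()):
    """depth = number of hidden layers; any nn.Module activation can be plugged in."""
    layers = [nn.Linear(in_dim, width), activation]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), activation]
    layers.append(nn.Linear(width, 1))
    return nn.Sequential(*layers)


def sparsity(model):
    """|theta|_0: number of nonzero parameters in the network."""
    return sum(int((p != 0).sum()) for p in model.parameters())


if __name__ == "__main__":
    # "General" activations in the sense of a broad class, not the paper's exact list.
    for act in (nn.ReLU(), nn.Sigmoid(), nn.Tanh(), nn.ELU()):
        net = make_mlp(in_dim=3, width=32, depth=4, activation=act)
        print(type(act).__name__, "depth=4 width=32 |theta|_0 =", sparsity(net))
```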

Cited by 67 publications (52 citation statements)
References 30 publications
“…However, Theorem 4.1 in Ohn and Kim (2019), which is stated for the reader's convenience in Appendix B, provides support for our observation, presented in Section 5, that sparsifying the network (i.e., splitting) increases the approximation error. Hence, in what follows, we also consider the so-called soft constraints approach using a fully connected network, where the static no-arbitrage conditions (2) are favored by penalization rather than imposed to hold exactly, as in the previous hard constraints approach.…”
Section: Soft Constraints Approach (supporting)
confidence: 70%
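
The soft-constraints idea in the quoted passage, favoring the constraints by penalization instead of enforcing them exactly, can be sketched as a penalty term added to the data-fit loss. The residual used below (monotonicity of the output in its first input) and the weight lam are placeholder assumptions, not the cited paper's actual conditions (2) or implementation.

```python
# Hedged sketch: soft constraints as a penalty on constraint violations.
# `constraint_residuals` stands in for the static no-arbitrage conditions of the
# cited paper; its exact form here is an assumption for illustration only.
import torch


def constraint_residuals(model, x):
    """Placeholder: quantities that should be >= 0 under the constraints.
    Here: the output's derivative w.r.t. the first input (a monotonicity stand-in)."""
    x = x.clone().requires_grad_(True)
    y = model(x)
    grads = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    return grads[:, 0]


def soft_constrained_loss(model, x, target, lam=10.0):
    # model is assumed to map (batch, d) inputs to (batch, 1) outputs.
    fit = torch.mean((model(x).squeeze(-1) - target) ** 2)   # data-fit term
    violation = torch.relu(-constraint_residuals(model, x))  # positive part of violations
    penalty = torch.mean(violation ** 2)                     # quadratic penalty
    return fit + lam * penalty
```

A fully connected network such as the one sketched earlier could be trained by minimizing this loss with a standard optimizer; larger lam pushes the network toward satisfying the constraints without enforcing them architecturally.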
“…Let the function being approximated, p ∈ H^{α,R}([0,1]^i), be Hölder smooth with parameters α > 0 and R > 0, where H^{α,R}(Ω) := {p ∈ H^α(Ω) : ‖p‖_{H^α(Ω)} ≤ R}. Then Theorem 4.1 in Ohn and Kim (2019) states the existence of positive constants L_0, N_0, Σ_0, B_0, depending only on i, α, R and ς, such that for any ε > 0, the neural network […] Figure A1 shows the upper bound Σ on the network sparsity, |θ|_0 ≤ Σ, as a function of the error tolerance ε and the Hölder smoothness α of the function being approximated.…”
Section: Discussion (mentioning)
confidence: 99%
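
One way to make the sparsity constraint |θ|_0 ≤ Σ from the quoted theorem concrete is magnitude pruning to a budget Σ, as sketched below. The theorem itself is an existence statement and does not prescribe any pruning procedure, so this is only an illustration of the constraint, not of the paper's construction.

```python
# Hedged sketch: imposing a sparsity budget |theta|_0 <= Sigma by magnitude pruning.
import torch


def prune_to_budget(model, sigma):
    """Zero out all but the `sigma` largest-magnitude parameters (ties may keep a few extra)."""
    flat = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    if sigma >= flat.numel():
        return model
    threshold = torch.topk(flat, sigma).values.min()
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() >= threshold).to(p.dtype))
    return model
```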
“…Previous works have demonstrated that locally non-linear activation functions consistently outperform multilinear activation functions across different network architectures on visual recognition tasks. This is supported by the recent theoretical study of Ohn and Kim (2019), which shows that a locally non-linear region can promote better expressivity and non-linear approximation capability.…”
Section: Locally Non-linear (mentioning)
confidence: 93%
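
A minimal way to see why some local non-linearity matters for expressivity: with a purely linear activation, any stack of layers collapses to a single linear map, whereas a locally non-linear activation such as ReLU does not. The NumPy check below illustrates this general point only; it is not a reproduction of the cited analysis.

```python
# Hedged sketch: a depth-3 network with identity activation is exactly linear,
# so it collapses to one linear map; swapping in ReLU breaks that collapse.
import numpy as np

rng = np.random.default_rng(0)
Ws = [rng.standard_normal((4, 4)) for _ in range(3)]   # random layer weights, no biases


def forward(x, act):
    h = x
    for W in Ws:
        h = act(W @ h)
    return h


identity = lambda z: z
relu = lambda z: np.maximum(z, 0.0)

x1, x2 = rng.standard_normal(4), rng.standard_normal(4)

# Additivity f(x1 + x2) == f(x1) + f(x2) holds for the identity activation...
print(np.allclose(forward(x1 + x2, identity), forward(x1, identity) + forward(x2, identity)))  # True
# ...but generally fails once the activation is locally non-linear (ReLU).
print(np.allclose(forward(x1 + x2, relu), forward(x1, relu) + forward(x2, relu)))              # typically False
```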
“…Actually, the Rectified Linear Unit (ReLU) activation function is the most popular choice in practical use of neural networks [12]. For this reason, most of the recent results on universal approximation theory concern the ReLU network [5,13–20]. Cohen et al. [13] exhibited a deep convolutional neural network with the ReLU activation function that cannot be realized by a shallow network unless the number of nodes in its hidden layer exceeds an exponential bound.…”
Section: Introduction (mentioning)
confidence: 99%
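
The depth-separation results referenced here are often illustrated with the iterated ReLU "hat" function: depth L and O(L) ReLU units produce a sawtooth with 2^L linear pieces, while a single-hidden-layer ReLU network with n units can produce at most n + 1 pieces and so needs on the order of 2^L units to match it. The sketch below counts the pieces numerically; it is a standard illustration, not the convolutional construction of Cohen et al. [13].

```python
# Hedged sketch: iterated ReLU "hat"/sawtooth as a depth-efficiency example.
# Depth L with O(L) ReLU units yields 2**L monotone linear pieces; one hidden
# ReLU layer with n units can produce at most n + 1 pieces.
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def hat(x):
    # g(x) = 2*min(x, 1 - x) on [0, 1], expressed with a single ReLU
    return 2.0 * x - 4.0 * relu(x - 0.5)


def sawtooth(x, depth):
    for _ in range(depth):
        x = hat(x)
    return x


# Count monotone linear pieces numerically on a dyadic grid (breakpoints land on grid nodes).
x = np.linspace(0.0, 1.0, 2 ** 20 + 1)
for L in range(1, 7):
    y = sawtooth(x, L)
    signs = np.sign(np.diff(y))
    pieces = int(np.count_nonzero(signs[1:] != signs[:-1]) + 1)
    print(f"depth {L}: {pieces} linear pieces (expected {2 ** L})")
```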