2019
DOI: 10.1016/j.neunet.2018.11.005

A comparison of deep networks with ReLU activation function and linear spline-type methods

Abstract: Deep neural networks (DNNs) generate much richer function spaces than shallow networks. However, since the function spaces induced by shallow networks themselves have several approximation-theoretic drawbacks, this alone does not necessarily explain the success of deep networks. In this article we take another route by comparing the expressive power of DNNs with ReLU activation function to linear spline methods. We show that MARS (multivariate adaptive regression splines) is improper learnable by DNNs in the sense that for any gi…
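The link between ReLU networks and spline-type methods that the abstract builds on can already be seen in one dimension. As a hedged illustration (the knot t and the input grid below are arbitrary choices for the example), a single ReLU unit reproduces a MARS hinge basis function (x - t)_+ exactly:

```python
# Illustrative sketch: a MARS hinge basis function (x - t)_+ equals one ReLU unit.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def mars_hinge(x, t):
    return np.maximum(0.0, x - t)   # MARS basis function (x - t)_+

t = 0.7                             # arbitrary knot for the example
x = np.linspace(-2.0, 2.0, 9)
# ReLU unit with weight 1 and bias -t matches the hinge function pointwise.
assert np.allclose(mars_hinge(x, t), relu(1.0 * x - t))
```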

Cited by 323 publications (141 citation statements); references 13 publications.
“…One main difference between our model and RED is the change in activation functions: the Rectified Linear Unit (ReLU) was replaced by a Leaky Rectified Linear Unit (LeakyReLU) for all convolutional and deconvolutional layers except the final deconvolutional layer, which uses a Clipped Rectified Linear Unit to limit output values to the range [0, 1]. Previous studies have shown that different activation functions have an impact on the final performance of a CNN [67, 68]. Hence, the improvement in the activation functions contributed to the better image restoration by the SR network.…”
Section: Methods
confidence: 99%
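A minimal sketch of the activation layout this excerpt describes, assuming a PyTorch-style encoder-decoder; the layer sizes and the use of Hardtanh(0, 1) as the clipped ReLU are illustrative assumptions, not the cited authors' exact architecture.

```python
# Hypothetical sketch: LeakyReLU in all conv/deconv layers, clipped ReLU on the last one.
import torch
import torch.nn as nn

class TinySRNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),                      # LeakyReLU instead of plain ReLU
            nn.ConvTranspose2d(64, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(64, 3, kernel_size=3, padding=1),
            nn.Hardtanh(0.0, 1.0),                  # clipped ReLU: output limited to [0, 1]
        )

    def forward(self, x):
        return self.body(x)

x = torch.rand(1, 3, 32, 32)
y = TinySRNet()(x)
assert y.min() >= 0.0 and y.max() <= 1.0
```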
“…One of the most promising new areas in artificial intelligence (AI) and machine learning for building predictive models is the family of so-called deep learning technologies [17], which extend traditional neural network architectures with more hidden layers and a larger variety of activation functions [18], that is, functions that map the input of each neuron to its output response. Convolutional neural networks (CNNs) have shown state-of-the-art performance for image classification, segmentation, and object detection and tracking.…”
Section: Introduction
confidence: 99%
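As a small illustration of the role the excerpt assigns to activation functions (the weights, bias, and choice of ReLU below are assumptions made only for the example), a single artificial neuron maps its weighted input to an output response through the activation:

```python
# Hypothetical single neuron: the activation maps the pre-activation to the output response.
import numpy as np

def neuron(x, w, b, activation=lambda z: np.maximum(0.0, z)):  # default: ReLU
    z = np.dot(w, x) + b      # weighted input (pre-activation)
    return activation(z)      # activation maps it to the neuron's output

x = np.array([0.5, -1.2, 2.0])
w = np.array([0.3, 0.8, -0.1])
print(neuron(x, w, b=0.1))                      # ReLU response
print(neuron(x, w, b=0.1, activation=np.tanh))  # a different activation, a different response
```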
“…The values of the mask were then determined to minimize the loss through iterative learning. In this study, the rectified linear unit (ReLU) function [43][44][45][46] is used as the activation function rather than the Sigmoid function, as shown in Figure 7. This is because a vanishing gradient (in which the gradient converges to zero) occurs if the Sigmoid function is used [47][48][49].…”
Section: Detection of Target Region
confidence: 99%
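A minimal numerical sketch of the vanishing-gradient point made in this excerpt (the sample pre-activation values are arbitrary assumptions): the sigmoid's derivative shrinks toward zero for large pre-activations, while the ReLU's derivative stays at 1 on the positive side.

```python
# Illustration: sigmoid gradients vanish for large |z|, ReLU gradients do not.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # bounded by 0.25 and -> 0 as |z| grows

def relu_grad(z):
    return (z > 0).astype(float)  # 1 for positive inputs, 0 otherwise

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid_grad(z))  # approx [4.5e-05, 0.105, 0.25, 0.105, 4.5e-05]
print(relu_grad(z))     # [0., 0., 0., 1., 1.]
```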