2016
DOI: 10.1609/aaai.v30i1.10243

On the Depth of Deep Neural Networks: A Theoretical View

Abstract: People believe that depth plays an important role in the success of deep neural networks (DNN). However, as far as we know, this belief lacks solid theoretical justification. We investigate the role of depth from the perspective of the margin bound. In the margin bound, the expected error is upper bounded by the empirical margin error plus a Rademacher Average (RA) based capacity term. First, we derive an upper bound for the RA of DNN and show that it increases with depth. This indicates a negative impact of depth on test performance…
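For context, the margin bound referenced in the abstract typically takes the following form (this is the textbook Rademacher-based margin bound; the constants and the exact statement used in the paper may differ): with probability at least $1-\delta$ over an i.i.d. sample of size $m$, for every classifier $f$ in the hypothesis class $\mathcal{F}$ and every margin $\gamma > 0$,

$$\Pr\big[\, y f(x) \le 0 \,\big] \;\le\; \widehat{\mathrm{err}}_\gamma(f) \;+\; \frac{2}{\gamma}\,\mathfrak{R}_m(\mathcal{F}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2m}},$$

where $\widehat{\mathrm{err}}_\gamma(f)$ is the fraction of training examples with margin below $\gamma$ and $\mathfrak{R}_m(\mathcal{F})$ is the Rademacher Average of the class. The paper's first result is an upper bound on $\mathfrak{R}_m(\mathcal{F})$ for DNNs that grows with depth, i.e., the capacity term is the side of the bound on which depth hurts.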

Cited by 83 publications (15 citation statements)
References 27 publications
“…Why do deeper networks perform worse in this task than the VGG13? In a recent theoretical study (Sun et al., 2016) of the depth of neural networks, it is shown that as the network deepens, the Rademacher Average (RA; a measurement of complexity) increases accordingly, which has some negative effects on the network. From the information acquisition point of view, deeper networks may be overfitting in learning and learning something unimportant, so the deeper networks do not seem to be able to perform simple tasks on par with computationally less sophisticated networks.…”
Section: Discussion
confidence: 99%
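The cited claim, that the Rademacher Average grows as the network deepens, can be illustrated numerically with a crude Monte Carlo proxy. The sketch below (Python with NumPy; the class of random-weight ReLU networks, the width, weight scale, and sample sizes are illustrative assumptions, and the max over sampled networks only lower-bounds the true supremum, so this is not the bound derived by Sun et al.) shows the mechanism: once per-layer weight magnitudes exceed the variance-preserving scale, the estimate grows with depth, mirroring the product-of-norms factor that appears in Rademacher bounds for DNNs.

import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(depth, width, d_in, gain=2.0):
    # Sample one ReLU network f: R^{d_in} -> R with `depth` hidden layers.
    # A gain above sqrt(2) (relative to the 1/sqrt(fan_in) scale) makes layer outputs grow with depth.
    dims = [d_in] + [width] * depth + [1]
    Ws = [rng.normal(0.0, gain / np.sqrt(dims[i]), size=(dims[i], dims[i + 1]))
          for i in range(len(dims) - 1)]
    def f(X):
        H = X
        for W in Ws[:-1]:
            H = np.maximum(H @ W, 0.0)   # hidden ReLU layers
        return (H @ Ws[-1]).ravel()      # linear output
    return f

def rademacher_proxy(depth, m=200, d_in=10, n_nets=300, n_sigma=20, width=32):
    # Crude proxy for the empirical Rademacher Average on a fixed sample:
    # the supremum over the function class is replaced by a max over sampled networks.
    X = rng.normal(size=(m, d_in))
    outs = np.stack([random_relu_net(depth, width, d_in)(X) for _ in range(n_nets)])
    estimates = []
    for _ in range(n_sigma):
        sigma = rng.choice([-1.0, 1.0], size=m)       # Rademacher signs
        estimates.append(np.max(outs @ sigma) / m)    # best correlation with the signs
    return float(np.mean(estimates))

for depth in (1, 2, 4, 8):
    print(f"depth={depth}: RA proxy ~ {rademacher_proxy(depth):.3f}")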
“…For instance, the deep network shown in Figure 1b involves complex computations with many parameters and intermediate data with high latency and energy consumption, which is not appropriate for low-cost resource-constrained devices. There is a tradeoff between these performances, however; deeper networks or increasing the depth of networks is not always good [40]. Inspired by this, we have conducted model reduction for the model to be lightweight by reducing the depth of the model and involving the intermediate maxpool functions in convolutions with less computation complexity and low latency.…”
Section: Proposed Model Optimization
confidence: 99%
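The folding of the intermediate maxpool into the convolutions described in this passage is, in the most common reading, a strided convolution. A minimal sketch, assuming PyTorch is available (the channel counts, kernel size, and input shape are illustrative, not the cited paper's actual architecture):

import torch
import torch.nn as nn

# Baseline block: convolution followed by a separate max-pooling step.
baseline = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2),
)

# Reduced block: the 2x downsampling is absorbed into the convolution's stride,
# removing the intermediate pooling pass over the full-resolution feature map.
reduced = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 3, 64, 64)
print(baseline(x).shape, reduced(x).shape)  # both are (1, 16, 32, 32)

The two blocks are shape-equivalent rather than numerically equivalent: the strided variant trades a little representational detail for fewer operations, less intermediate data, and lower latency, which is the tradeoff the cited work targets.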
“…For neural networks, however, the hypothesis space is large and combinatorially explodes in size with the neural network width and depth, making the corresponding bounds loose (cf. Bartlett et al., 1998; Harvey et al., 2017; Bartlett et al., 2019; Sun et al., 2016). Uniform bounds that utilize the parametric characterization of the network also grow rapidly with the size of the neural network (e.g., Neyshabur et al., 2015).…”
Section: Related Work
confidence: 99%
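To make the growth-with-size point concrete on the parameter-counting side, a small Python sketch (illustrative only: it counts weights and biases of a plain fully connected ReLU network, not any model from the cited works) shows how the quantities that parametric uniform bounds typically scale with grow as depth increases at a fixed width:

def mlp_param_count(d_in, width, depth, d_out=1):
    # Weights + biases of a fully connected network with `depth` hidden layers of size `width`.
    dims = [d_in] + [width] * depth + [d_out]
    return sum(dims[i] * dims[i + 1] + dims[i + 1] for i in range(len(dims) - 1))

for depth in (1, 2, 4, 8, 16):
    print(f"depth={depth:2d}: {mlp_param_count(d_in=784, width=256, depth=depth):,} parameters")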