2016
DOI: 10.1007/978-3-319-46493-0_39

Deep Networks with Stochastic Depth

Abstract: Very deep convolutional networks with hundreds of layers have led to significant reductions in error on competitive benchmarks. Although the unmatched expressiveness of the many layers can be highly desirable at test time, training very deep networks comes with its own set of challenges. The gradients can vanish, the forward flow often diminishes, and the training time can be painfully slow. To address these problems, we propose stochastic depth, a training procedure that enables the seemingly contradictory setup to train short networks and use deep networks at test time.
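The abstract describes the core training rule: for each mini-batch, each residual block is either executed or skipped entirely, with the identity shortcut carrying the signal when the block is dropped. Below is a minimal sketch of that rule, assuming a PyTorch-style residual block; the class name StochasticDepthBlock and the survival_prob parameter are illustrative, not taken from the authors' released code.

import torch
import torch.nn as nn

class StochasticDepthBlock(nn.Module):
    """Residual block that may be skipped entirely during training."""
    def __init__(self, channels, survival_prob=0.5):
        super().__init__()
        self.survival_prob = survival_prob
        # Residual branch f_l: two 3x3 convolutions, as in a basic ResNet block.
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        if self.training:
            # Per-mini-batch Bernoulli gate: with probability 1 - survival_prob
            # the whole residual branch is bypassed and only the identity is kept.
            if torch.rand(1).item() < self.survival_prob:
                return self.relu(x + self.residual(x))
            return x
        # At test time every block is active; the residual branch is scaled by
        # its survival probability, analogous to the dropout test-time rule.
        return self.relu(x + self.survival_prob * self.residual(x))

Because a skipped block reduces to the identity, the expected depth during training is shorter than the test-time depth, which is where the reported training-time savings come from.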

Cited by 1,526 publications (1,214 citation statements)
References 23 publications
“…The comparison is provided in Table 2. Our method shows some accuracy degradation compared with the state-of-the-art supervised method [45], which increases the depth of residual networks even beyond 1,200 layers, an astronomically large number of layers.…”
Section: Methods (mentioning)
confidence: 96%
“…Accuracy: L2 sparse filtering [40]: 63.89%; 3-way factored RBM (3 layers) [41]: 65.30%; Mc RBM (3 layers) [42]: 71.00%; Tiled CNN [43]: 73.10%; Stochastic pooling ConvNet [44]: 84.87%; Deep networks with stochastic depth [45]: 95.09%; Our method: 74.18%. … different feature numbers and sparsity parameter values.…”
Section: Methods (mentioning)
confidence: 99%
“…The ensemble method is also an effective technique for reducing generalization error by combining multiple models. Huang et al. [31] recently developed a training algorithm that drops a random subset of layers in a deep residual network [20] and achieves good performance in reducing overfitting. Bastien et al. [32] developed a powerful generator of stochastic variations for handwritten character images.…”
Section: Related Studies (mentioning)
confidence: 99%
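Restating the layer-dropping rule quoted above in equations (notation follows the stochastic depth paper: b_l is the per-block Bernoulli gate, p_l its survival probability, f_l the residual branch, and L the total number of blocks):

H_\ell = \mathrm{ReLU}\!\left(b_\ell \, f_\ell(H_{\ell-1}) + \mathrm{id}(H_{\ell-1})\right), \qquad b_\ell \sim \mathrm{Bernoulli}(p_\ell)

p_\ell = 1 - \frac{\ell}{L}\left(1 - p_L\right), \qquad \mathbb{E}[\tilde{L}] = \sum_{\ell=1}^{L} p_\ell

With the linear decay rule and p_L = 0.5, the expected number of active blocks per mini-batch is roughly 3L/4: a random, shallower subnetwork is trained at each step, and the full-depth network used at test time behaves like an implicit ensemble of those subnetworks, which is the connection the quoted passage draws to ensemble methods.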
“…The ResNet was also shown to alleviate optimization difficulty in training very deep networks. In [34], identity shortcut connections were used for bypassing a subset of layers to facilitate training very deep networks.…”
Section: Related Work (mentioning)
confidence: 99%
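To tie the two quoted descriptions together, here is a hedged usage sketch of stacking blocks with linearly decaying survival probabilities; it reuses the illustrative StochasticDepthBlock from the sketch after the abstract, and the block count and channel width are example values, not the paper's exact configuration.

# Stack L blocks with the linear decay rule p_l = 1 - (l/L) * (1 - p_L).
L, p_L, channels = 54, 0.5, 16
blocks = nn.Sequential(*[
    StochasticDepthBlock(channels, survival_prob=1.0 - (l / L) * (1.0 - p_L))
    for l in range(1, L + 1)
])

Under this schedule the earliest blocks are almost always kept while the last block is dropped about half the time, reflecting the paper's motivation that early, low-level features are needed by all subsequent layers.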