2017
DOI: 10.48550/arxiv.1710.04773
Preprint

Residual Connections Encourage Iterative Inference

Abstract: Residual networks (Resnets) have become a prominent architecture in deep learning. However, a comprehensive understanding of Resnets is still a topic of ongoing research. A recent view argues that Resnets perform iterative refinement of features. We attempt to further expose properties of this aspect. To this end, we study Resnets both analytically and empirically. We formalize the notion of iterative refinement in Resnets by showing that residual connections naturally encourage features of residual blocks to …
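As a rough illustration of the iterative-refinement view sketched in the abstract, a stack of residual blocks can be read as repeated small updates to the same feature vector. This is a minimal numerical sketch, not the authors' code; the toy residual_block function, its step size, and the printed diagnostic are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    """One toy residual block: x <- x + f(x), with f a small nonlinear map."""
    return x + 0.1 * np.tanh(W @ x)

# A stage of blocks sharing the same input/output width, as in a ResNet stage.
dim, depth = 8, 5
weights = [rng.standard_normal((dim, dim)) for _ in range(depth)]

x = rng.standard_normal(dim)
for i, W in enumerate(weights, start=1):
    x_new = residual_block(x, W)
    # The update is small relative to the feature norm: each block makes an
    # incremental refinement rather than a wholesale rewrite of the features.
    print(f"block {i}: |update| / |x| = {np.linalg.norm(x_new - x) / np.linalg.norm(x):.3f}")
    x = x_new
```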

Cited by 21 publications (35 citation statements)
References 8 publications
“…Such phenomenon shows that the first several blocks in a C-block have relatively higher LCMC than the others. The discovery gained here is consistent with the reports from previous works Veit et al [2016]; Jastrzebski et al [2017].…”
Section: Comparisons and Discussion (supporting)
confidence: 94%
“…Relatedly, residual blocks, a popular architectural pattern in feedforward models, might provide an inductive bias similar to recurrence (Liao & Poggio, 2016), enabling an equivalent form of iterative processing, which could explain their increased efficacy in categorization (compared to vanilla feedforward models), especially for ambiguous images (Jastrzębski et al, 2017).…”
Section: Discussion (mentioning)
confidence: 99%
“…Since this first block is different from the rest, it cannot be folded. These projection blocks could be eliminated, as explored in [21] and [15], but following the intuition that these dimensional transformations correspond to compositional changes in the level of representation [5], this paper keeps them. The rest of the blocks are folded into a single recurrent block, which is iterated a number of times equal to the number of blocks folded (see Figure 1c).…”
Section: Proposed Method: Hidden-Fold Network (mentioning)
confidence: 99%
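A minimal PyTorch-style sketch of the folding idea described in the quoted passage: several residual blocks of a stage are replaced by one block whose weights are reused on every iteration, so the stage behaves like a small recurrent network. The FoldedStage class, its layer layout, and parameter names are illustrative assumptions, not code from the cited paper.

```python
import torch
import torch.nn as nn

class FoldedStage(nn.Module):
    """Fold a stage of residual blocks into one block iterated `steps` times."""

    def __init__(self, channels, steps):
        super().__init__()
        self.steps = steps
        self.block = nn.Sequential(            # one shared residual body
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        for _ in range(self.steps):            # iterate as many times as blocks folded
            x = torch.relu(x + self.block(x))  # same weights on every iteration
        return x
```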
“…However, folded ResNets suffer from overfitting and exploding layer activations. Reference [15] suggested using unshared batch normalization (UBN) to alleviate this problem. UBN consists of not sharing batch normalization's learnable parameters (β and γ) between folded blocks, having a different set for each iteration instead, as exemplified in Figure 3.…”
Section: Proposed Method: Hidden-Fold Network (mentioning)
confidence: 99%
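A minimal sketch of unshared batch normalization (UBN) as described in the quoted passage: convolution weights are shared across iterations, but each iteration gets its own BatchNorm layer, so the learnable β and γ (and running statistics) differ per step. Class and attribute names are hypothetical and not taken from the cited reference.

```python
import torch
import torch.nn as nn

class UnsharedBNFold(nn.Module):
    """Folded residual block with a separate BatchNorm per iteration (UBN)."""

    def __init__(self, channels, steps):
        super().__init__()
        self.steps = steps
        # Shared convolution weights across all iterations.
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        # One BatchNorm per iteration: unshared beta/gamma and running stats.
        self.bns = nn.ModuleList([nn.BatchNorm2d(channels) for _ in range(steps)])

    def forward(self, x):
        for t in range(self.steps):
            x = torch.relu(x + self.bns[t](self.conv(x)))  # iteration-specific normalization
        return x
```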