2019
DOI: 10.1101/747394
Preprint

Capsule Networks as Recurrent Models of Grouping and Segmentation

Abstract: Classically, visual processing is described as a cascade of local feedforward computations. Feedforward Convolutional Neural Networks (ffCNNs) have shown how powerful such models can be. Previously, using visual crowding as a well-controlled challenge, we showed that no classic model of vision, including ffCNNs, can explain human global shape processing (1). Here, we show that Capsule Neural Networks (CapsNets; 2), combining ffCNNs with a grouping and segmentation mechanism, solve this challenge…
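The grouping and segmentation mechanism that sets CapsNets apart from ffCNNs is routing-by-agreement between capsule layers (Sabour et al.'s dynamic routing, reference 2 in the abstract). Below is a minimal NumPy sketch of one routing step, for illustration only; it is not code from the preprint, and the function names, capsule counts, and vector dimensions are assumptions chosen for the example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Shrinks short vectors toward zero and caps long ones near unit length,
    # so a capsule's length can be read as the probability that its entity is present.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    # Illustrative routing-by-agreement.
    # u_hat[i, j] is the prediction lower capsule i makes for higher capsule j,
    # shape (num_in_caps, num_out_caps, out_dim).
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                            # routing logits, start uniform
    for _ in range(num_iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients (softmax over output capsules)
        s = np.einsum('ij,ijd->jd', c, u_hat)                  # weighted sum of predictions per output capsule
        v = squash(s)                                          # output capsule activity vectors
        b = b + np.einsum('ijd,jd->ij', u_hat, v)              # strengthen couplings whose predictions agree with the consensus
    return v

# Toy usage: 8 low-level capsules routed into 3 object-level capsules of dimension 16.
u_hat = np.random.randn(8, 3, 16)
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 16)
```

Capsules whose predictions agree end up with large coupling coefficients and are effectively grouped into the same higher-level capsule; this grouping behaviour is what the abstract and the citing papers link to global shape processing and (un)crowding.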

Cited by 4 publications (5 citation statements) | References 50 publications

“…Our results contribute to the expanding literature showing that there is much more to vision than combining local feature detectors in a feedforward hierarchical manner (Baker et al., 2018; Brendel & Bethge, 2019; Doerig, Bornet, et al., 2019; Doerig, Schmittwilken, et al., 2019; Funke et al., 2018; Kar, Kubilius, Schmidt, Issa, & DiCarlo, 2019; Kietzmann et al., 2019; Kim, Linsley, Thakkar, …). In line with the present findings, many studies have highlighted other fundamental differences between ffCNNs and humans in local vs. global processing. For example, Baker et al. (2018) showed that ffCNNs, but not humans, are affected by local changes to edges and textures of objects.…”
Section: Discussion (supporting, confidence: 89%)
“…For this principled reason, we propose that ffCNNs cannot produce uncrowding in general, independently of the specific ffCNN, training procedure and loss function. In support of this proposal, we showed in a separate contribution that ffCNNs specifically trained on classifying verniers and flanking shapes, as well as counting the number of flankers, do not produce global (un)crowding either (Doerig, Schmittwilken, Sayim, Manassi, & Herzog, 2019).…”
Section: Discussion (supporting, confidence: 64%)
“…Some related work on large artificial networks in linguistics (e.g., Vankov and Bowers, 2020; Jiang et al., 2021; Kim and Smolensky, 2021) suggests some strategies for combining extensive associative training with symbolic processing. In vision, capsule networks (Sabour et al., 2018) include some relational coding and have been shown to increase configural sensitivity in uncrowding effects (Doerig et al., 2020). Another recent model adds external memory to a recurrent DCNN to allow for explicit symbolic processing, resulting in rapid abstract rule learning (Webb et al., 2021).…”
Section: Discussion (mentioning, confidence: 99%)