Depth in convolutional neural networks solves scene segmentation
Preprint, 2019
DOI: 10.1101/2019.12.16.877753

Abstract: Feedforward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans, however, suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional visu…
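The depth manipulation the title refers to can be pictured with off-the-shelf networks. The sketch below is not the authors' pipeline; the torchvision ResNet variants, the pretrained-weights argument, and the "scene.jpg" path are assumptions for illustration. It simply runs the same image through feedforward networks of increasing depth, the kind of comparison the abstract describes.

```python
# Minimal sketch (not the authors' pipeline): compare feedforward DCNNs of
# increasing depth on the same image. torchvision's pretrained ImageNet
# ResNets stand in for "shallow" vs. "deep" feedforward networks.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Networks ordered by depth (the number in the name is the layer count).
nets = {
    "resnet18": models.resnet18(weights="IMAGENET1K_V1"),
    "resnet50": models.resnet50(weights="IMAGENET1K_V1"),
    "resnet152": models.resnet152(weights="IMAGENET1K_V1"),
}

def top1(model, image_path):
    """Return the top-1 ImageNet class index for one image."""
    model.eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return logits.argmax(dim=1).item()

# "scene.jpg" is a placeholder path, e.g. an object embedded in a cluttered scene.
for name, net in nets.items():
    print(name, top1(net, "scene.jpg"))
```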


Cited by 4 publications (3 citation statements)
References 52 publications
“…First, although feedforward networks can in principle implement any function [54], recurrent networks can implement certain functions more efficiently. Flexible grouping and segmentation is exactly the kind of function that may benefit from recurrent computations (see also [55]). For example, to determine which local elements should be grouped into a global object, it helps to compute the global object first.…”
Section: Discussion (mentioning)
confidence: 99%
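A hedged illustration of the point made in this statement: the block below is not taken from the cited papers; it is a generic recurrent convolutional block that applies the same convolution repeatedly, so a grouping estimate computed on one pass can inform the next, without adding parameters the way stacking extra feedforward layers would.

```python
# Minimal sketch (illustrative only): a recurrent convolutional block that
# refines a grouping/segmentation-like feature map over several time steps.
# The same weights are reused at every step, so recurrence adds computation
# without adding parameters.
import torch
import torch.nn as nn

class RecurrentConvBlock(nn.Module):
    def __init__(self, channels: int, steps: int = 4):
        super().__init__()
        self.feedforward = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.lateral = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.steps = steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Initial feedforward pass ...
        h = torch.relu(self.feedforward(x))
        # ... then recurrent refinement: each step sees the current global
        # estimate, so local elements can be regrouped in light of it.
        for _ in range(self.steps):
            h = torch.relu(self.feedforward(x) + self.lateral(h))
        return h

# Usage: refine a 16-channel feature map for a 64x64 input.
block = RecurrentConvBlock(channels=16, steps=4)
out = block(torch.randn(1, 16, 64, 64))
print(out.shape)  # torch.Size([1, 16, 64, 64])
```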
“…Recently, a multitude of studies have reconciled these seemingly inconsistent findings by indicating that recurrent processes might be employed adaptively, depending on the visual input: while feed-forward activity might suffice for simple scenes with isolated objects, more complex scenes or more challenging conditions (e.g. objects that are occluded or degraded) may need additional visual operations (‘routines’) requiring recurrent computations (Groen et al, 2018; Tang et al, 2018; Kar et al, 2019; Rajaei et al, 2019; Seijdel et al, 2020). For objects in isolation, or very simple scenes, rapid recognition may thus rely on a coarse and unsegmented feed-forward representation (Crouzet and Serre, 2011), while for more cluttered images recognition may require explicit encoding of spatial relationships between parts.…”
Section: Introduction (mentioning)
confidence: 99%
“…Intriguingly, these networks not only parallel human performance on some object recognition tasks (VanRullen, 2017), but they also feature processing characteristics that bear a lot of resemblance to the visual ventral stream in primates (Eickenberg et al, 2017; Güçlü and van Gerven, 2015; Khaligh-Razavi and Kriegeskorte, 2014; Kubilius et al, 2018; Schrimpf et al, 2020; Yamins et al, 2014). Leveraging this link between neural processing and performance has already enhanced insight into the potential mechanisms underlying shape perception (Kubilius et al, 2016), scene segmentation (Seijdel et al, 2020) and the role of recurrence during object recognition (Kar et al, 2019; Kietzmann et al, 2019b). DCNNs may thus provide a promising avenue for systematically investigating how different attention mechanisms may modulate neural processing and thereby, performance.…”
Section: Introduction (mentioning)
confidence: 99%
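For readers unfamiliar with how the resemblance between DCNN layers and the ventral stream is typically quantified, the sketch below illustrates representational similarity analysis (RSA) with placeholder random arrays in place of real activations and recordings; it is a generic example, not the analysis from any of the cited studies.

```python
# Minimal RSA sketch (illustrative only): correlate the pattern of pairwise
# stimulus dissimilarities in a network layer with the same pattern in
# neural recordings. Random arrays stand in for real data here.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli = 50

# Placeholder data: one feature vector per stimulus for each system.
layer_features = rng.normal(size=(n_stimuli, 512))   # e.g. a DCNN layer
neural_patterns = rng.normal(size=(n_stimuli, 100))  # e.g. voxel/unit responses

# Representational dissimilarity structure (condensed upper triangles).
rdm_model = pdist(layer_features, metric="correlation")
rdm_brain = pdist(neural_patterns, metric="correlation")

# Second-order similarity: how well the model's geometry predicts the brain's.
rho, p = spearmanr(rdm_model, rdm_brain)
print(f"model-brain RSA correlation: rho={rho:.3f}, p={p:.3f}")
```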