2014
DOI: 10.3389/fncom.2014.00158

A conceptual framework of computations in mid-level vision

Abstract: If a picture is worth a thousand words, as an English idiom goes, what should those words—or, rather, descriptors—capture? What format of image representation would be sufficiently rich if we were to reconstruct the essence of images from their descriptors? In this paper, we set out to develop a conceptual framework that would be: (i) biologically plausible in order to provide a better mechanistic understanding of our visual system; (ii) sufficiently robust to apply in practice on realistic images; and (iii) a…

Cited by 26 publications (26 citation statements)
References 191 publications (226 reference statements)
“…Could magnocellular transient processes suffice to support this ability, perhaps explaining how even in LG's case of hindered downstream flow, such "vision at a glance" might function normally? Interestingly, a recent framework for mid-level vision computations (Kubilius et al., 2014) proposes that the gist of the scene can be achieved in a pre-attentive manner by pooling together information from multiple sources (color, orientation, natural statistics of the scene, and so on). For such a framework to support "vision at a glance" (Hochstein and Ahissar, 2002; Kubilius et al., 2014), it would be reasonable to assume that such pre-attentive computations should occur fast and be based on coarse spatial resolution, as is the case with the magnocellular pathway (Schmolesky et al., 1998; Lamme and Roelfsema, 2000), which appears to function normally in LG (Gilaie-Dotan et al., 2009, 2011).…”
Section: Open Questions
confidence: 99%
“…Interestingly, a recent framework for mid-level vision computations (Kubilius et al., 2014) proposes that the gist of the scene can be achieved in a pre-attentive manner by pooling together information from multiple sources (color, orientation, natural statistics of the scene, and so on). For such a framework to support "vision at a glance" (Hochstein and Ahissar, 2002; Kubilius et al., 2014), it would be reasonable to assume that such pre-attentive computations should occur fast and be based on coarse spatial resolution, as is the case with the magnocellular pathway (Schmolesky et al., 1998; Lamme and Roelfsema, 2000), which appears to function normally in LG (Gilaie-Dotan et al., 2009, 2011). Another such processing route that might also be normal in LG (see place recognition in Table 2) might rely on peripheral form-based computations, which are suggested to respond transiently (Gilaie-Dotan et al., 2008) and rely on coarse spatial resolution (Levy et al., 2001, 2004; Dumoulin and Wandell, 2008).…”
Section: Open Questions
confidence: 99%
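The pooling idea in the excerpts above, summarizing several feature channels at coarse spatial resolution into a compact "gist" descriptor, can be sketched in a few lines. This is a minimal toy illustration, not the model of Kubilius et al. (2014); the function name `pooled_gist`, the choice of channels, and the grid size are all illustrative assumptions.

```python
import numpy as np

def pooled_gist(feature_maps, grid=4):
    """Pool each feature channel over a coarse grid x grid spatial grid,
    keeping only one summary statistic (the mean) per cell, as a toy
    stand-in for pre-attentive, coarse-resolution 'vision at a glance'."""
    descriptor = []
    for fmap in feature_maps:
        h, w = fmap.shape
        # Crop so the map divides evenly into grid x grid cells.
        cropped = fmap[: h - h % grid, : w - w % grid]
        # Reshape into (grid, cell_h, grid, cell_w) blocks and average
        # within each block, discarding fine spatial detail.
        blocks = cropped.reshape(grid, cropped.shape[0] // grid,
                                 grid, cropped.shape[1] // grid)
        descriptor.append(blocks.mean(axis=(1, 3)).ravel())
    return np.concatenate(descriptor)

# Hypothetical channels: e.g., color opponency, orientation energy, contrast.
channels = [np.random.rand(32, 48) for _ in range(3)]
gist = pooled_gist(channels, grid=4)  # 3 channels x 16 cells = 48 numbers
```

For three 32×48 feature maps and a 4×4 grid, the descriptor has only 48 numbers, a drastic compression in keeping with the idea that gist computations are fast and operate at coarse spatial resolution.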
“…Consequently, while computer science typically employs solutions that only seldom rely on prior neuroscientific knowledge, and its goal is to maximize task accuracy (e.g., with deep learning), visual neuroscience somewhat lacks solid computational models and formal explanations, ending up with several arbitrary modeling assumptions, especially for mid-level vision processing such as scene segmentation or shape feature extraction (for a definition, see Kubilius et al., 2014).…”
Section: Facing the Challenge of Explicit Modeling in Visual Neuroscience
confidence: 99%
“…First, the low-level description of the stimuli was grounded in features extracted by the early visual cortex (i.e., image contrast and spatial frequencies). Second, since shape is critical for interacting with the surrounding environment [17], we relied on a well-established, physiologically motivated description of shape, i.e., the medial axis [18]. Finally, objects were also distinctively represented according to their superordinate categories.…”
Section: Introduction
confidence: 99%
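The low-level description mentioned in the excerpt above (image contrast and spatial frequencies) can be illustrated with a short sketch: RMS contrast plus energy in a few log-spaced spatial-frequency bands read off the image's power spectrum. This is a generic illustration under stated assumptions, not the exact descriptor used by the cited study; `lowlevel_descriptor`, the band edges, and the number of bands are all assumptions.

```python
import numpy as np

def lowlevel_descriptor(img, n_bands=4):
    """Return (RMS contrast, per-band spectral energy) for a grayscale
    image, a toy early-visual-cortex-style low-level description."""
    img = img - img.mean()
    rms_contrast = img.std()  # RMS contrast of the mean-subtracted image
    # Radial spatial-frequency coordinate (cycles per pixel) for each
    # Fourier coefficient.
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    radius = np.hypot(fy, fx)
    power = np.abs(np.fft.fft2(img)) ** 2
    # Log-spaced band edges between assumed low and Nyquist frequencies.
    edges = np.logspace(np.log10(0.01), np.log10(0.5), n_bands + 1)
    band_energy = np.array([power[(radius >= lo) & (radius < hi)].sum()
                            for lo, hi in zip(edges[:-1], edges[1:])])
    return rms_contrast, band_energy

img = np.random.rand(64, 64)          # stand-in grayscale stimulus
rms, bands = lowlevel_descriptor(img)  # scalar contrast + 4 band energies
```

The relative energy across bands gives a coarse spatial-frequency profile of the stimulus, which is the kind of information the excerpt attributes to early visual cortex.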