2020
DOI: 10.1007/978-3-030-64919-7_17

Visual Search as Active Inference

Abstract: Visual search is an essential cognitive ability, offering a prototypical control problem to be addressed with Active Inference. Under a Naive Bayes assumption, the maximization of the information gain objective is consistent with the separation of the visual sensory flow into two independent pathways, namely the "What" and the "Where" pathways. On the "What" side, the processing of the central part of the visual field (the fovea) provides the current interpretation of the scene, here the category of the target. …

Cited by 7 publications (5 citation statements)
References 24 publications
“…The functional relevance of this is that the probability density for x_v, when translated into polar or Cartesian coordinates, will assign higher variance to more eccentric values. This has been proposed as an explanation for the increased variance of saccadic endpoints for more eccentric locations in a Cartesian frame, despite uniform variance in log-polar reference frames (Daucé and Perrinet, 2020).…”
Section: The Brainstem (mentioning)
confidence: 98%
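The geometric point in this statement can be checked numerically. The following is a minimal sketch, not the cited authors' model: it assumes Gaussian noise of fixed standard deviation on log-eccentricity and on angle, and the function name `cartesian_endpoint_std` and the parameters `sigma_log` and `sigma_theta` are hypothetical, chosen only for illustration.

```python
import numpy as np


def cartesian_endpoint_std(eccentricity, sigma_log=0.1, sigma_theta=0.1,
                           n=20000, seed=0):
    """Spread of simulated endpoints in Cartesian coordinates.

    Assumption (illustrative only): endpoint noise is Gaussian with the
    same standard deviation at every eccentricity when expressed in
    log-polar coordinates (log radius, angle).
    """
    rng = np.random.default_rng(seed)
    # Uniform-variance noise in the log-polar frame.
    log_r = np.log(eccentricity) + sigma_log * rng.standard_normal(n)
    theta = sigma_theta * rng.standard_normal(n)
    # Map the samples back to the Cartesian frame.
    r = np.exp(log_r)
    x = r * np.cos(theta)
    y = r * np.sin(theta)
    # Total spread: square root of the summed per-axis variances.
    return float(np.sqrt(x.var() + y.var()))


eccentricities = [1.0, 2.0, 4.0, 8.0]
spreads = [cartesian_endpoint_std(e) for e in eccentricities]
for e, s in zip(eccentricities, spreads):
    print(f"eccentricity {e:4.1f} -> Cartesian spread {s:.4f}")
```

Under these assumptions the noise is multiplicative in the radius, so the Cartesian spread grows in proportion to the target eccentricity even though the log-polar variance is held constant.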
“…Taking into account the architecture of POLO ATN, the notion of "Foveated" Spatial Transformers comes to light (see Fig. 5); wholly based on specially modified attention-only spatial transformers [11], they integrate the biological realism and the computational efficiency of a Log-Polar based artificial vision system like the recent What/Where Model [16], [18], alongside the easiness of learning of spatial transformers of different translations in objects inside images, all this happens during classification without any annotation added to the training procedure.…”
Section: Discussion (mentioning)
confidence: 99%
“…In that model, the ventral and dorsolateral pathways are responsible for object vision and spatial localization, respectively [17]. Captured within an active inference framework [18]- [20], this model works in a sequential way. A first and key aspect of this artificial visual processing setup is the compression of the visual data through a center-surround log-polar grid representation; as is the case of the foveated vision in mammals [21].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, recent work also focused on active vision. In [26] a generative model learning representations of a whole 3D scene was used for an active inference agent, whereas in [2] an explicit what and where stream were modeled for classifying MNIST digits.…”
Section: Related Work (mentioning)
confidence: 99%