2020
DOI: 10.3390/ai1040030

A Biologically Motivated, Proto-Object-Based Audiovisual Saliency Model

Abstract: The natural environment and our interaction with it are essentially multisensory, where we may deploy visual, tactile and/or auditory senses to perceive, learn and interact with our environment. Our objective in this study is to develop a scene analysis algorithm using multisensory information, specifically vision and audio. We develop a proto-object-based audiovisual saliency map (AVSM) for the analysis of dynamic natural scenes. A specialized audiovisual camera with 360° field of view, capable of locating so…

Citations: cited by 3 publications (2 citation statements)
References: 75 publications
“…This method measures the spatial support of a particular class in each image via a heatmap. Saliency maps have been used for region-of-interest extraction [47], medical imaging [48, 49], robot vision [50], and audio-visual integration [51, 52], in addition to AD diagnosis based on MRI [32]. Saliency maps are obtained by computing the gradient of the output category in relation to the input image.…”
Section: Methods (mentioning)
confidence: 99%
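
The citing statement above describes saliency maps as the gradient of the output category with respect to the input image. The following is a minimal sketch of that idea, not the citing paper's implementation: the toy classifier `class_score`, its weights, and the 2x2 "image" are hypothetical, and the gradient is approximated by central finite differences rather than backpropagation.

```python
import math


def class_score(image, weights):
    """Hypothetical toy classifier: tanh of a weighted sum over all pixels."""
    total = sum(
        w * p
        for w_row, p_row in zip(weights, image)
        for w, p in zip(w_row, p_row)
    )
    return math.tanh(total)


def gradient_saliency(image, score_fn, eps=1e-5):
    """Approximate |d score / d pixel| for every pixel by central differences."""
    work = [row[:] for row in image]  # copy so the caller's image is untouched
    saliency = [[0.0] * len(row) for row in work]
    for r, row in enumerate(work):
        for c, original in enumerate(row):
            work[r][c] = original + eps
            up = score_fn(work)
            work[r][c] = original - eps
            down = score_fn(work)
            work[r][c] = original  # restore before moving to the next pixel
            saliency[r][c] = abs((up - down) / (2 * eps))
    return saliency


# Assumed example data: a 2x2 image and per-pixel weights for one class.
image = [[0.1, 0.2], [0.3, 0.4]]
weights = [[1.0, 0.0], [-2.0, 0.5]]
saliency = gradient_saliency(image, lambda img: class_score(img, weights))
```

In this toy setting the pixel with weight 0 receives zero saliency, and the pixel with the largest absolute weight receives the highest value, which is the heatmap behaviour the statement describes.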
“…Quick memorization of encountered configurations can efficiently seed the process of subsequent grinding-in. It has been demonstrated that already at a rudimentary level of proto-objects, the linear combination of visual and auditory feature conspicuity maps, when incorporating gestalt-properties of proximity, convexity, and surroundedness, allows to capture a higher number of valid salient events than uni-sensory saliency maps [35,36] (4,5/D,E). The combined effect of bottom-up activation and top-down bias in living beings has been found in many experiments, e.g., during visual search [37,38].…”
Section: /C) (mentioning)
confidence: 99%
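
The statement above refers to a linear combination of visual and auditory feature conspicuity maps. A minimal sketch of that combination step follows, under stated assumptions: the maps are small hand-made grids, each map is min-max normalized before mixing, and the equal weights are illustrative. The gestalt grouping (proximity, convexity, surroundedness) mentioned in the statement is not modelled here, and the function names are hypothetical.

```python
def normalize(m):
    """Min-max normalize a 2D map to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def combine_maps(visual, auditory, w_visual=0.5, w_auditory=0.5):
    """Weighted sum of normalized visual and auditory conspicuity maps."""
    v = normalize(visual)
    a = normalize(auditory)
    return [
        [w_visual * vv + w_auditory * aa for vv, aa in zip(v_row, a_row)]
        for v_row, a_row in zip(v, a)
    ]


def most_salient(m):
    """Return the (row, col) of the largest value in a 2D map."""
    cells = ((r, c) for r in range(len(m)) for c in range(len(m[0])))
    return max(cells, key=lambda rc: m[rc[0]][rc[1]])


# Assumed example maps: vision favours the top-right cell, audio the bottom-left.
visual = [[0.0, 2.0], [1.0, 0.0]]
auditory = [[0.0, 0.0], [3.0, 0.0]]
combined = combine_maps(visual, auditory)
```

With these assumed inputs, the visual map alone peaks at the top-right cell, while the combined map peaks at the bottom-left cell, where moderate visual and strong auditory evidence coincide. This illustrates how a combined map can surface an event that a single-sense map would rank lower.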