2023
DOI: 10.3390/app13106056

Audio–Visual Sound Source Localization and Tracking Based on Mobile Robot for The Cocktail Party Problem

Abstract: Locating the sound source is one of the most important capabilities of robot audition. In recent years, single-source localization techniques have increasingly matured. However, localizing and tracking specific sound sources in multi-source scenarios, which is known as the cocktail party problem, is still unresolved. In order to address this challenge, in this paper, we propose a system for dynamically localizing and tracking sound sources based on audio–visual information that can be deployed on a mobile robo…

Cited by 6 publications (2 citation statements)
References 43 publications
“…Another line focuses on finding a better visual appearance model to track multiple speakers in indoor environments [34]. At the same time, other proposals are centered on audiovisual tracking in compact configurations (co-located camera and microphone array) for applications such as human-robot interaction [17, 20, 21, 35].…”
Section: Previous Work | Citation type: mentioning
confidence: 99%
“…The latency of generating DOA estimates will limit how quickly a humanoid robot can respond to movement of a current talker or orient towards a new talker. Works such as [5,6] consider accurate DOA estimation on robotic systems, but also require a consideration for latency and turn-taking in the context of human-robot conversational scenarios.…”
Section: Introduction | Citation type: mentioning
confidence: 99%