2022
DOI: 10.1109/taslp.2022.3156758

Audio-Visual Based Online Multi-Source Separation

citations
Cited by 7 publications
(7 citation statements)
references
References 61 publications
“…Ong et al [23] designed a cutting-edge method for real-time online multi-source separation that combines audio and visual data to determine the speakers' locations and separate the intended speech from background noise. They developed a deep neural network that integrates both visual and auditory features to carry out this processing.…”
Section: Related Work
mentioning
confidence: 99%
“…Ong et al [23] Real-time online multi-source separation that estimates speaker position and separates target speech from background noise using audio and visual data.…”
Section: The Efficacy Of Noise and Reverberation Removal
mentioning
confidence: 99%
“…The target state is represented as Bernoulli RFS, which is empty or has a single element. In [1], generalized labeled Bernoulli filter (GLMB) is employed to solve the problem of multi-modal space-time permutation and deal with the problem of varying number of speakers. In [21], the Poisson multi-Bernoulli mixture (PMBM) filter is proposed for multi-target speaker tracking, which employs Poisson distributions to represent undetected targets and employs a multi-Bernoulli mixture to represent detected targets with different data association strategies.…”
Section: A Speaker Tracking
mentioning
confidence: 99%
“…Speaker tracking plays an important role in speech separation [1], speech enhancement [2] and speaker diarization [3]. The task of speaker tracking is to estimate the 2D position, 3D position or Direction of Arrival (DOA) of speakers at each time step.…”
Section: Introduction
mentioning
confidence: 99%
“…For instance, in [5], spatial search was used to decompose the filtering density into independent GLMBs that are processed in parallel, while, in [21], a parallel centralized implementation for multiple sensors was developed. The GLMB filter has also found a host of applications from robotics [22], sensor networks [23], [24], cell biology [14], [25] to audio/video processing [15], [26].…”
mentioning
confidence: 99%