2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) 2022
DOI: 10.1109/avss56176.2022.9959632

Improved Separation of Closely-spaced Speakers by Exploiting Auxiliary Direction of Arrival Information within a U-Net Architecture

Abstract: Microphone arrays use spatial diversity for separating concurrent audio sources. Source signals from different directions of arrival (DOAs) are captured with DOA-dependent time delays between the microphones. These can be exploited in the short-time Fourier transform domain to yield time-frequency masks that extract a target signal while suppressing unwanted components. Using deep neural networks (DNNs) for mask estimation has drastically improved separation performance. However, separation of closely spaced so…
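The link between DOA-dependent delays and time-frequency masks described in the abstract can be illustrated with a minimal sketch. It is not the paper's method (the paper estimates masks with a DNN); it assumes a two-microphone array, a far-field source, a speed of sound of 343 m/s, and a hand-crafted cosine mask. The function names `expected_phase_difference` and `phase_based_mask` are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def expected_phase_difference(doa_deg, mic_distance, freqs_hz):
    """Far-field inter-microphone phase difference (radians) per frequency.

    Assumption: doa_deg is measured from the array axis, so 0 degrees is
    endfire (largest delay) and 90 degrees is broadside (zero delay).
    """
    tdoa = mic_distance * np.cos(np.deg2rad(doa_deg)) / SPEED_OF_SOUND
    return 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) * tdoa


def phase_based_mask(stft_mic1, stft_mic2, expected_ipd):
    """Soft mask in [0, 1] for STFTs of shape (frequencies, frames).

    The mask is close to 1 in bins where the observed phase difference
    matches the one expected for the target DOA. The cosine makes the
    comparison insensitive to 2*pi phase wrapping.
    """
    observed_ipd = np.angle(stft_mic1 * np.conj(stft_mic2))
    mismatch = observed_ipd - expected_ipd[:, None]
    return 0.5 * (1.0 + np.cos(mismatch))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = np.linspace(0.0, 8000.0, 257)
    ipd = expected_phase_difference(30.0, 0.05, freqs)
    # Synthetic single-source scene: microphone 2 is a delayed copy of microphone 1.
    x1 = rng.standard_normal((257, 10)) + 1j * rng.standard_normal((257, 10))
    x2 = x1 * np.exp(-1j * ipd[:, None])
    mask = phase_based_mask(x1, x2, ipd)
    print(mask.min(), mask.max())
```

For two sources at nearly the same angle, the expected phase differences are almost identical, so a mask of this kind cannot tell them apart. This is one way to see why closely spaced speakers are the hard case.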

Cited by 4 publications (1 citation statement)
References 28 publications
“…This avoids an implicit far-field assumption and leads to better performance, as we showed in [26]. Similarly, Kindt et al [31] have shown that a learned encoding based on a one-hot encoded angle used as a feature to improve separation of closely spaced speakers is more valuable than a hand-crafted feature based on expected phase differences.…”
Section: Introduction (supporting)
confidence: 56%
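
As a rough illustration of the "one-hot encoded angle" mentioned in the citation statement, the sketch below builds such a vector. It covers only the input side: the learned encoding itself would be a trainable layer applied to this vector, which is not shown. The 5-degree resolution, the 0 to 180 degree range, and the name `one_hot_doa` are assumptions for illustration, not values taken from the cited work.

```python
import numpy as np


def one_hot_doa(doa_deg, resolution_deg=5.0, max_deg=180.0):
    """Return a one-hot vector marking the angle bin nearest to doa_deg.

    Assumptions: angles lie in [0, max_deg] and are quantized on a
    uniform grid with spacing resolution_deg.
    """
    n_bins = int(round(max_deg / resolution_deg)) + 1
    index = int(round(doa_deg / resolution_deg))
    index = min(max(index, 0), n_bins - 1)  # clamp out-of-range angles
    vec = np.zeros(n_bins, dtype=np.float32)
    vec[index] = 1.0
    return vec


if __name__ == "__main__":
    v = one_hot_doa(90.0)
    print(v.shape, int(v.argmax()))
```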