2021
DOI: 10.48550/arxiv.2103.13443
Preprint

Blind Speech Separation and Dereverberation using Neural Beamforming

Abstract: In this paper, we present the Blind Speech Separation and Dereverberation (BSSD) network, which performs simultaneous speaker separation, dereverberation and speaker identification in a single neural network. Speaker separation is guided by a set of predefined spatial cues. Dereverberation is performed by using neural beamforming, and speaker identification is aided by embedding vectors and triplet mining. We introduce a frequency-domain model which uses complex-valued neural networks, and a time-domain variant…

Cited by 1 publication (1 citation statement)
References 47 publications (93 reference statements)
“…Besides speech enhancement, dereverberation is also a challenging task since it is hard to pinpoint the direct path signal and differentiate it from its copies, especially when reverberation is strong and non-stationary noise is also present. Some previous works have already been done on simultaneous multi-channel enhancement and dereverberation [10,11,12]. However, it is still a relatively hard task for the single-channel scenario due to the lack of spatial information.…”
Section: Introduction (mentioning)
confidence: 99%