2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018
DOI: 10.1109/icassp.2018.8462430

Deep Learning Based Speech Beamforming

Abstract: Multi-channel speech enhancement with ad-hoc sensors has been a challenging task. Speech model guided beamforming algorithms are able to recover natural sounding speech, but the speech models tend to be oversimplified or the inference would otherwise be too complicated. On the other hand, deep learning based enhancement approaches are able to learn complicated speech distributions and perform efficient inference, but they are unable to deal with variable number of input channels. Also, deep learning approaches…

Cited by 64 publications (81 citation statements)
References 24 publications
“…Other methods that are not part of these three main categories include ones that combine neural networks and beamforming in different ways. For example, [28] uses a single-channel speech enhancement network to first estimate the source of interest and then applies time-domain Wiener-filtering based beamforming.…”
Section: Introduction (mentioning)
confidence: 99%
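
The two-stage idea in this statement (estimate the target with a single-channel network, then beamform in the time domain) can be illustrated with a minimal sketch. It is not the cited paper's implementation: a least-squares FIR fit stands in for the Wiener solution, the network output is assumed to be available as `target_estimate`, and the names `delayed_stack` and `wiener_beamform` are hypothetical.

```python
import numpy as np

def delayed_stack(x, taps):
    """Stack causal delayed copies of each channel: (C, T) -> (T, C * taps)."""
    channels, length = x.shape
    cols = []
    for c in range(channels):
        for d in range(taps):
            col = np.zeros(length)
            col[d:] = x[c, :length - d]  # channel c delayed by d samples
            cols.append(col)
    return np.stack(cols, axis=1)

def wiener_beamform(x, target_estimate, taps=8):
    """Fit per-channel FIR filters so the filtered sum matches target_estimate.

    Assumption: target_estimate is the output of some single-channel
    enhancement network (not implemented here). The least-squares fit is
    the sample-based counterpart of a time-domain Wiener filter.
    """
    X = delayed_stack(x, taps)
    h, *_ = np.linalg.lstsq(X, target_estimate, rcond=None)
    return X @ h, h.reshape(x.shape[0], taps)
```

Because the filters are fitted jointly over all rows of `x`, the same routine works for any number of input channels, which is the property the abstract says plain deep learning enhancers lack.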
“…Based on the observation in Table 1, we only report the SNR-A2T results with α = 0.3, 1, 3, as they all achieve on par or better separation performance than the standard SNR while have a much lower distortion. The definition of direct-path signal can vary in different literatures. Defining the direct-path RIR filter as ±6 ms of the first peak in the RIR filter is the same as [42], however the range can be relaxed to cover more early reverberation components similar to [21]. Here we also investigate the effect of different definitions of direct-path signals.…”
Section: Results (mentioning)
confidence: 99%
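
The ±6 ms direct-path definition quoted above can be sketched as a simple window on the room impulse response (RIR). This is an illustration under stated assumptions, not code from any of the cited papers: the strongest tap is taken as a proxy for the "first peak", and the function name `direct_path_rir` and its parameters are hypothetical.

```python
import numpy as np

def direct_path_rir(rir, sample_rate, half_width_ms=6.0):
    """Zero the RIR outside +/- half_width_ms around its strongest tap."""
    # Assumption: the strongest tap marks the direct path ("first peak").
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(half_width_ms * 1e-3 * sample_rate))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, len(rir))
    direct = np.zeros_like(rir)
    direct[start:stop] = rir[start:stop]
    return direct
```

Increasing `half_width_ms` corresponds to the relaxed definition mentioned in the statement, which keeps more early reverberation in the target signal.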
“…In recent years, research on DNNs has become more popular and DNN-based source separation methods have been proposed [17][18][19][20][21]. IDLMA [17] has been proposed for determined audio source separation.…”
Section: A Motivation and Framework (mentioning)
confidence: 99%