2020
DOI: 10.1109/taslp.2020.2987441
DeepMMSE: A Deep Learning Approach to MMSE-Based Noise Power Spectral Density Estimation

Abstract: An accurate noise power spectral density (PSD) tracker is an indispensable component of a single-channel speech enhancement system. Bayesian-motivated minimum mean-square error (MMSE)-based noise PSD estimators have been the most prominent in recent times. However, they lack the ability to track highly non-stationary noise sources due to current methods of a priori signal-to-noise ratio (SNR) estimation. This is caused by the underlying assumption that the noise signal changes at a slower rate than the speech signal.…

Cited by 115 publications (99 citation statements) | References 40 publications
“…Deep neural network: ResNet-TCN A modified version of the residual network (ResNet) TCN from Zhang et al (2020) is used to evaluate each training target. 3 The set of hyperparameters for ResNet-TCN used in this work are derived from Zhang et al (2020). It is shown from input to output in Figure 2.…”
Section: Experiments Setup
confidence: 99%
“…unit Each block contains three one-dimensional causal dilated convolutional units. Here, we modify the preactivation of the convolutional units in Zhang et al (2020) by using the rectifier linear activation function followed by layer normalisation without the scale and shift operations (again following Xu et al (2019)). The kernel size, output size, and dilation rate for each convolutional unit are denoted in Figure 2 as (kernel size, output size, dilation rate).…”
Section: Conv1d
confidence: 99%
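The causal dilated convolution described in the statement above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation; the function name and the explicit-loop formulation are hypothetical, and real TCNs would use a deep-learning framework. It shows the one defining property: left-only padding, so each output sample depends only on current and past inputs, with taps spaced `dilation` samples apart.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Convolve sequence x with kernel, padding on the left only so that
    out[t] depends solely on x[0..t] (causality). Illustrative sketch."""
    k = len(kernel)
    pad = (k - 1) * dilation              # left padding preserves causality
    padded = [0.0] * pad + list(x)
    out = []
    for t in range(len(x)):
        # kernel[0] multiplies the current sample; kernel[i] reaches
        # back i * dilation samples into the past
        s = sum(kernel[i] * padded[t + pad - i * dilation] for i in range(k))
        out.append(s)
    return out
```

For example, with kernel `[1.0, 1.0]` and `dilation=2`, each output is the sum of the current sample and the sample two steps earlier, so the receptive field grows with the dilation rate while the kernel stays small, which is why the ResNet-TCN blocks increase the dilation rate across convolutional units.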
“…A modified version of the residual network (ResNet) TCN from (Zhang et al, 2020) is used to evaluate each training target. 3 It is shown from input to output in Figure 2.…”
Section: A Deep Neural Network: ResNet TCN
confidence: 99%
“…The input is first transformed by FC, a fully-connected layer of size d model = 256. Instead of applying layer normalisation (Ba et al, 2016) followed by the rectifier linear function to FC, as in (Zhang et al, 2020), we apply the rectifier linear activation function followed by layer normalisation without the scale and shift operations. This reduces overfitting, as demonstrated in (Xu et al, 2019).…”
Section: A Deep Neural Network: ResNet TCN
confidence: 99%
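The modified preactivation order in the statement above (rectifier linear function first, then layer normalisation without the learnable scale and shift) can be sketched in a few lines of pure Python. The function names are hypothetical and this operates on a single feature vector rather than a batched tensor; it only illustrates the order of operations and what "without scale and shift" means.

```python
import math

def relu(v):
    """Rectifier linear activation, applied elementwise."""
    return [max(0.0, a) for a in v]

def layer_norm_no_affine(v, eps=1e-5):
    """Layer normalisation without the learnable scale (gamma) and
    shift (beta): normalise to zero mean and unit variance only."""
    mu = sum(v) / len(v)
    var = sum((a - mu) ** 2 for a in v) / len(v)
    return [(a - mu) / math.sqrt(var + eps) for a in v]

def preactivation(v):
    # Modified order: ReLU first, then layer norm (no scale/shift),
    # rather than layer norm followed by ReLU
    return layer_norm_no_affine(relu(v))
```

Dropping the scale and shift removes the per-feature learnable parameters from the normalisation, which is the overfitting-reduction effect the statement attributes to Xu et al (2019).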