Interspeech 2017
DOI: 10.21437/interspeech.2017-733

Neural Network-Based Spectrum Estimation for Online WPE Dereverberation

Abstract: In this paper, we propose a novel speech dereverberation framework that utilizes deep neural network (DNN)-based spectrum estimation to construct linear inverse filters. The proposed dereverberation framework is based on the state-of-the-art inverse filter estimation algorithm called the weighted prediction error (WPE) algorithm, which is known to effectively reduce reverberation and greatly boost ASR performance in various conditions. In WPE, the accuracy of the inverse filter estimation, and thus the deverbe…

Cited by 88 publications (99 citation statements)
References 24 publications
“…We compare the performance with the state-of-the-art dereverberation method called Weighted Prediction Error (WPE), which is known to effectively reduce reverberation and greatly boost the speech enhancement performance. We used the more recent version of WPE [29] which is also based on DNN [30]. However, WPE uses a different architecture based on LSTM.…”
Section: Reference and Performance Assessment (mentioning)
confidence: 99%
“…4 in [110] describe the NN, which operates in the log-spectral domain, that is used for pre-cleaning the noisy and reverberant speech signal. The pre-cleaned power spectral domain is then used for the WPE algorithm; a particular feature of the algorithm in [110] is that the WPE method does not need more than one iteration.…”
Section: Additional Literature Review (mentioning)
confidence: 99%
“…The latter showing improvements upon RNN based methods [55] in terms of PESQ and STOI scores. For noise suppression and dereverberation, authors in [56] proposed to use an RNN to estimate the power spectral density (PSD) prior to prediction filter estimation and inverse filtering, showing ASR improvements compared to the baseline method that iteratively calculated PSD on the enhanced speech signal. None of these techniques have been evaluated in terms of how well they can improve TTS quality.…”
Section: Our Work In Context (mentioning)
confidence: 99%
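
The citation statements above describe the same pipeline: a network supplies a power spectral density (PSD) estimate, and WPE then estimates the prediction filter and applies inverse filtering in a single pass instead of iterating. A minimal sketch of that single pass for one frequency bin is given below. It is a batch (offline) illustration, not the online variant the paper's title refers to, and not the authors' implementation. The function names, the default `delay` and `taps` values, and the `psd` argument (which stands in for the network output) are assumptions made for this example.

```python
import numpy as np


def stack_delayed(Y, delay, taps):
    """Stack delayed copies of Y (channels x frames) for one frequency bin.

    Block k of the result holds y(t - delay - k); frames before the start
    of the signal are zero-padded.
    """
    D, T = Y.shape
    Y_tilde = np.zeros((D * taps, T), dtype=Y.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            Y_tilde[k * D:(k + 1) * D, shift:] = Y[:, :T - shift]
    return Y_tilde


def wpe_one_pass(Y, psd, delay=3, taps=10, eps=1e-10):
    """One WPE pass for a single frequency bin, given a PSD estimate.

    Y   : complex array, shape (channels, frames)
    psd : real array, shape (frames,); assumed here to come from an
          external estimator such as a neural network (hypothetical input)
    """
    Y_tilde = stack_delayed(Y, delay, taps)
    inv_psd = 1.0 / np.maximum(psd, eps)

    # PSD-weighted correlation statistics.
    R = (Y_tilde * inv_psd) @ Y_tilde.conj().T   # (D*taps, D*taps)
    P = (Y_tilde * inv_psd) @ Y.conj().T         # (D*taps, D)

    # Small diagonal loading keeps the solve well conditioned.
    R = R + eps * (np.trace(R).real / R.shape[0]) * np.eye(R.shape[0])

    # Prediction filter, then inverse filtering (subtract predicted late part).
    G = np.linalg.solve(R, P)
    return Y - G.conj().T @ Y_tilde
```

In a full system this function would be called once per frequency bin of the STFT, with `psd` taken from the network output for that bin. The iterative baseline mentioned in the last statement would instead re-estimate `psd` from the enhanced signal and repeat the pass.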