ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413775

Deep Multi-Frame MVDR Filtering for Single-Microphone Speech Enhancement

Abstract: Multi-frame algorithms for single-microphone speech enhancement, e.g., the multi-frame minimum variance distortionless response (MFMVDR) filter, are able to exploit speech correlation across adjacent time frames in the short-time Fourier transform (STFT) domain. Provided that accurate estimates of the required speech interframe correlation vector and the noise correlation matrix are available, it has been shown that the MFMVDR filter yields a substantial noise reduction while hardly introducing any speech dist…
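
The abstract names two inputs to the MFMVDR filter: the speech interframe correlation vector and the noise correlation matrix. As a minimal sketch of how those two quantities combine, the block below uses the standard closed-form MVDR expression, w = Φ⁻¹γ / (γᴴΦ⁻¹γ), for a single time-frequency bin. The function name `mfmvdr_weights`, the variable names, and the diagonal loading term are illustrative assumptions and are not taken from the paper; the paper's deep-learning estimation of the two inputs is not shown here.

```python
import numpy as np

def mfmvdr_weights(phi_n, gamma_x, diag_load=1e-6):
    """Closed-form MVDR weights for one time-frequency bin (illustrative sketch).

    phi_n     : (N, N) Hermitian noise correlation matrix over N adjacent frames
    gamma_x   : (N,) speech interframe correlation vector, normalized so that
                the entry for the current frame equals 1
    diag_load : small regularization factor (an assumption of this sketch)
    """
    n = phi_n.shape[0]
    # Diagonal loading scaled by the average eigenvalue keeps the solve stable.
    phi = phi_n + diag_load * np.real(np.trace(phi_n)) / n * np.eye(n)
    # Solve phi @ v = gamma_x instead of forming the inverse explicitly.
    v = np.linalg.solve(phi, gamma_x)
    # Normalize so the distortionless constraint w^H gamma_x = 1 holds.
    return v / np.vdot(gamma_x, v)

# Usage with synthetic data (random values, not real speech statistics).
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
phi_n = a @ a.conj().T + np.eye(4)           # Hermitian positive definite
gamma_x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
gamma_x = gamma_x / gamma_x[0]               # current-frame entry equals 1
w = mfmvdr_weights(phi_n, gamma_x)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)  # N noisy STFT frames
x_hat = np.vdot(w, y)                        # filtered output w^H y
```

The quality of the output depends entirely on how well the two inputs are estimated, which is the condition the abstract states ("provided that accurate estimates ... are available").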

Cited by 18 publications (8 citation statements)
References 45 publications

“…The STOI and PESQ of the noisy and enhanced speech are measured at -5 dB, 0 dB, and 5 dB. As shown in Table 1, compared with the MMSE-based approach [18], dual-microphone DNN speech enhancement [19], the CRN model [20], and the MFMVDR model [21], the proposed method significantly improves performance. For example, at SNR = 0 dB, the NABDTN method increased STOI by 12.7% and PESQ by 1.08, whereas the MFMVDR only improved STOI by 10.42% and PESQ by 1.02.…”
Section: Results (mentioning)
confidence: 95%
“…It is still challenging to reduce the delay to below 4 ms without affecting performance, although some researchers have tried to solve this problem (Vary, 2006; Schröter et al., 2022; Zheng et al., 2022). Tammen & Doclo (2021) proposed a deep multiframe approach to reduce the delay for hearing aids, and this approach was further extended for binaural noise reduction (Tammen & Doclo, 2022). From the application perspective, future work should concentrate on reducing the complexity, storage, and latency of deep-learning methods to facilitate their application in hearing aids and cochlear implants.…”
Section: Conclusion and Future Prospects (mentioning)
confidence: 99%
“…Firstly D is trained using (5); as a result the replay buffer now contains both enhanced and de-enhanced data, effectively doubling its size. After D's training, N is trained using (6). Then G is trained as usual.…”
Section: A MetricGAN+/- framework (mentioning)
confidence: 99%