Interspeech 2020
DOI: 10.21437/interspeech.2020-2439

INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising

Cited by 16 publications (39 citation statements)
References 22 publications
“…Parts of this work, namely one of our two proposed losses, have been pre-published with limited analysis and evaluation in [49]. Following up our proposed CLs, Xia et al proposed a modified components loss for speech enhancement in [50], and Strake et al proposed a component loss for a joint denoising and dereverberation task in [51].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Due to the promising performance of convolutional encoder-decoder topology in the SE applications [20,21,22], we employ it in the first three modules. As both DN-Net and DR-Net are operated in the magnitude domain, only one decoder is utilized to recover the magnitude.…”
Section: Network Configurations (citation type: mentioning)
confidence: 99%
“…The Deep Noise Suppression (DNS) Challenge for Interspeech 2020 (DNS1) [18], ICASSP 2021 (DNS2) [19], and Interspeech 2021 (DNS3) [20] addressed a challenging task for joint dereverberation and denoising. Strake et al [21] achieved a second rank in the DNS1 non-realtime track by training an actually realtime-capable FCRN for complex spectral masking-based denoising and dereverberation, which we build upon. All of the abovementioned works apply supervised training using synthetic data, where clean speech and noise are prepared separately and are mixed by addition, while reverberation (if considered) is applied by convolution with simulated or recorded room impulse responses (RIRs).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
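
The synthetic-data recipe described in the statement above (reverberation by convolution with an RIR, noise mixed in by addition) can be sketched as follows. This is a minimal illustration in plain Python, not the pipeline of any cited work. The function names, the truncation to the clean-signal length, and the SNR-based noise scaling are assumptions made for the example; the statement itself only mentions addition and convolution.

```python
import math


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def synthesize_mixture(clean, noise, rir, snr_db):
    """Hypothetical helper: reverberate clean speech, then add scaled noise.

    Assumes len(noise) >= len(clean) and that neither signal is all zeros.
    """
    # Reverberation: convolve with the RIR, truncated to the clean length
    # (the truncation is an assumption of this sketch).
    reverberant = convolve(clean, rir)[:len(clean)]
    # Assumed SNR control: scale the noise so that
    # 10 * log10(P_reverberant / P_noise) equals snr_db.
    noise = noise[:len(reverberant)]
    gain = math.sqrt(power(reverberant) / (power(noise) * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    # Mixing by addition, as in the statement above.
    noisy = [s + n for s, n in zip(reverberant, scaled_noise)]
    return noisy, reverberant, scaled_noise


clean = [1.0, 0.0, 0.0, 0.0]          # toy "speech": a unit impulse
rir = [1.0, 0.5]                      # toy RIR: direct path plus one reflection
noise = [0.1, -0.1, 0.1, -0.1]
noisy, reverberant, scaled_noise = synthesize_mixture(clean, noise, rir, snr_db=10.0)
```

Returning the reverberant signal and the scaled noise alongside the mixture keeps the individual components available as training targets, which supervised setups of this kind typically need.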
“…Furthermore, we solved the problems addressed in [29] by training the FCRN and the PESQNet alternatingly, similar to alternating training schemes used in adversarial trainings [22][23][24]. We combine the loss provided by the PESQNet and the successful FCRN loss from [21], thereby training a complex mask-based FCRN in a "weakly" supervised way, employing both synthetic and real data in training.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
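
For readers unfamiliar with complex mask-based enhancement, the sketch below shows the usual formulation, in which a complex-valued mask is multiplied element-wise with the noisy short-time spectrum so that both magnitude and phase can change. Whether the FCRN applies its mask in exactly this way is not stated in the excerpts above, so treat this as an assumption; `apply_complex_mask` and the toy values are hypothetical.

```python
def apply_complex_mask(noisy_spec, mask):
    """Multiply each time-frequency bin of the noisy spectrum by its mask value.

    Both arguments are lists of frames, each frame a list of complex bins.
    """
    return [
        [m * y for m, y in zip(mask_frame, spec_frame)]
        for mask_frame, spec_frame in zip(mask, noisy_spec)
    ]


# One frame with two frequency bins (toy values).
noisy_spec = [[1 + 1j, 2 + 0j]]
# First bin: real gain of 0.5 (magnitude change only).
# Second bin: multiplication by -1j (a -90 degree phase rotation only).
mask = [[0.5 + 0j, 0 - 1j]]

enhanced = apply_complex_mask(noisy_spec, mask)
```

A real-valued mask could only scale each bin's magnitude; the second bin illustrates what the complex form adds, since its magnitude is unchanged while its phase is rotated.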