2021
DOI: 10.48550/arxiv.2102.04198
Preprint

ICASSP 2021 Deep Noise Suppression Challenge: Decoupling Magnitude and Phase Optimization with a Two-Stage Deep Network

Abstract: It remains a tough challenge to recover the speech signals contaminated by various noises under real acoustic environments. To this end, we propose a novel system for denoising in the complicated applications, which is mainly comprised of two pipelines, namely a two-stage network and a post-processing module. The first pipeline is proposed to decouple the optimization problem w.r.t. magnitude and phase, i.e., only the magnitude is estimated in the first stage and both of them are further refined in the second …
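
The decoupling described in the abstract can be illustrated with a minimal sketch. The abstract only says that the first stage estimates the magnitude; pairing that estimate with the phase of the noisy input to form a coarse complex spectrum is an assumption made here for illustration, and the function name `stage_one_coarse_spectrum` and the sample values are hypothetical. The second-stage refinement of magnitude and phase is not modeled.

```python
import cmath

def stage_one_coarse_spectrum(noisy_bins, estimated_magnitudes):
    """Build a coarse complex spectrum from stage-one magnitude estimates.

    Assumption (not stated in the abstract): each estimated magnitude is
    paired with the unmodified phase of the corresponding noisy bin.
    """
    return [
        cmath.rect(mag, cmath.phase(z))  # polar -> rectangular form
        for z, mag in zip(noisy_bins, estimated_magnitudes)
    ]

# Hypothetical noisy time-frequency bins and stage-one magnitude estimates.
noisy = [3 + 4j, -1 + 1j, 0.5 - 2j]
est_mag = [4.0, 0.5, 1.5]

coarse = stage_one_coarse_spectrum(noisy, est_mag)
for z_in, z_out in zip(noisy, coarse):
    # The magnitude changes; the phase is carried over from the noisy bin.
    print(abs(z_in), abs(z_out), cmath.phase(z_in), cmath.phase(z_out))
```

Under this assumption, any phase error in the noisy input survives the first stage unchanged, which is why a later stage that refines both quantities is needed.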

Cited by 2 publications (8 citation statements)
References 30 publications
“…To capture the long-term temporal dependencies, we insert cascaded temporal convolutional modules (TCMs) [10] in the bottleneck. To decrease the parameters, we opt to the squeezed version [15,16], i.e., S-TCM, where the feature size is first compressed into 64 rather than 512 as the literature stated [10], followed by dilated convolutions. For each stage, we stack three groups of TCMs, each of which includes 6 S-TCMs with dilation rate d = {1, 2, 4, 8, 16, 32}.…”
Section: Network Configurations (mentioning)
confidence: 99%
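
To give a sense of how much temporal context the quoted configuration covers, the following sketch computes the receptive field of three stacked groups of six dilated layers with d = {1, 2, 4, 8, 16, 32}. The kernel size of 3, a stride of 1, and one dilated convolution per S-TCM are assumptions; the statement above does not specify them, and the helper name `receptive_field` is hypothetical.

```python
def receptive_field(dilations, kernel_size, groups):
    """Receptive field (in frames) of stacked 1-D dilated convolutions.

    Each stride-1 layer with dilation d widens the field by (kernel_size - 1) * d.
    Assumes one dilated convolution per S-TCM, which the source does not confirm.
    """
    span_per_group = sum((kernel_size - 1) * d for d in dilations)
    return 1 + groups * span_per_group

DILATIONS = [1, 2, 4, 8, 16, 32]  # dilation rates from the quoted statement

# kernel_size=3 is an assumed value; groups=3 follows the quoted statement.
rf = receptive_field(DILATIONS, kernel_size=3, groups=3)
print(rf)  # 379
```

With these assumed values, each stage would see 379 frames of context. The exponential dilation schedule is what allows that span to be reached with only 18 layers per stage.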
“…We compare the proposed framework with five advanced SE systems, namely GCRN [8], DCCRN [34], TSCN [16], AECNN [35], and Conv-TasNet [10]. GCRN, DCCRN, and TSCN are complex-domain based approaches, which aim to recover both magnitude and phase information simultaneously.…”
Section: Baselines (mentioning)
confidence: 99%