ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9414958

Deep Residual Echo Suppression With A Tunable Tradeoff Between Signal Distortion And Echo Suppression

Abstract: In this paper, we propose a residual echo suppression method using a UNet neural network that directly maps the outputs of a linear acoustic echo canceler to the desired signal in the spectral domain. This system embeds a design parameter that allows a tunable tradeoff between the desired-signal distortion and residual echo suppression in double-talk scenarios. The system employs 136 thousand parameters, and requires 1.6 giga floating-point operations per second and 10 megabytes of memory. The implementation …

Cited by 17 publications (10 citation statements)
References: 13 publications
“…To evaluate the performances of the DSML and RESL, we employ a deep learning-based RES system that embeds a tunable design parameter [20]. This system comprises a UNet neural network [21] with two input channels and one output channel.…”
Section: RES System With a Design Parameter
Citation type: mentioning
confidence: 99%
“…Additional real recordings were conducted in our lab to test the generalization of the DSML and RESL to unseen setups and their robustness to extremely low levels of SERs. This database is fully described in [20]. For completion, it contains 40 h of recordings from the TIMIT [24] and LibriSpeech [25] corpora with SNR levels of 32 ± 5 dB and SER levels distributed on [−20, −10] dB.…”
Section: Database
Citation type: mentioning
confidence: 99%
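
To make the SER range in the statement above concrete, here is a minimal sketch of how a signal-to-echo ratio can be computed and how an echo signal can be scaled to hit a target SER. The quoted text does not define SER, so the sketch assumes the common definition, 10·log10 of near-end speech power over echo power. The helper names (`power`, `ser_db`, `scale_echo_to_ser`), the white-noise stand-ins for speech and echo, and the uniform draw from [−20, −10] dB are all illustrative assumptions. The database described in the citing paper consists of real recordings, so this is not the authors' procedure.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def ser_db(speech, echo):
    """Signal-to-echo ratio in dB, assumed here to be 10*log10(P_speech / P_echo)."""
    return 10.0 * math.log10(power(speech) / power(echo))


def scale_echo_to_ser(speech, echo, target_ser_db):
    """Return a scaled copy of `echo` whose SER against `speech` equals the target."""
    gain = math.sqrt(power(speech) / (power(echo) * 10.0 ** (target_ser_db / 10.0)))
    return [gain * v for v in echo]


# Hypothetical stand-in signals: one second of white noise at 16 kHz each.
rng = random.Random(0)
speech = [rng.gauss(0.0, 1.0) for _ in range(16000)]
echo = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Assumption: draw the target SER uniformly from [-20, -10] dB.
target = rng.uniform(-20.0, -10.0)

scaled_echo = scale_echo_to_ser(speech, echo, target)
mic = [s + e for s, e in zip(speech, scaled_echo)]  # simulated microphone signal
achieved = ser_db(speech, scaled_echo)
```

A negative SER means the echo is stronger than the near-end speech. At −20 dB the echo power is 100 times the speech power, which is why the statement calls these levels extremely low.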