Interspeech 2020
DOI: 10.21437/interspeech.2020-1169

On Loss Functions and Recurrency Training for GAN-Based Speech Enhancement Systems

Abstract: Recent work has shown that it is feasible to use generative adversarial networks (GANs) for speech enhancement; however, these approaches have not been compared to state-of-the-art (SOTA) non-GAN-based approaches. Additionally, many loss functions have been proposed for GAN-based approaches, but they have not been adequately compared. In this study, we propose novel convolutional recurrent GAN (CRGAN) architectures for speech enhancement. Multiple loss functions are adopted to enable direct comparisons to othe…

Cited by 32 publications (23 citation statements)
References 33 publications
“…SEGAN. Given a noise-corrupted raw audio signal $\tilde{x} = x + n \in \mathbb{R}^T$, where $x \in \mathbb{R}^T$ denotes a clean signal and $n \in \mathbb{R}^T$ denotes additive background noise, the goal of speech enhancement is to find a mapping $f(\tilde{x}) : \tilde{x} \to x$ that recovers the clean signal $x$ from the noisy signal $\tilde{x}$. SEGAN methods [13,14,15] achieve this goal by designating the generator $G$ as the enhancement mapping, i.e., $\hat{x} = G(z, \tilde{x})$, where $z$ is a latent variable. The discriminator $D$ is tasked with distinguishing the enhanced output $\hat{x}$ from the real clean signal $x$.…”
Section: Self-Attention SEGAN
Mentioning, confidence: 99%
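The enhancement mapping quoted above can be made concrete with a short sketch. The PyTorch code below is a minimal illustration of a SEGAN-style setup, not the architecture from the paper or the citing work: the `Generator` and `Discriminator` classes, layer sizes, kernel widths, and the way the latent $z$ is combined with the noisy waveform are all illustrative assumptions.

```python
# Minimal PyTorch sketch of a SEGAN-style setup: x_hat = G(z, x_tilde),
# with D trained to separate clean x from enhanced x_hat.
# All shapes, layer sizes, and class names are illustrative assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Enhancement mapping: x_hat = G(z, x_tilde)."""
    def __init__(self, channels=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, channels, kernel_size=31, padding=15),  # latent + noisy waveform
            nn.PReLU(),
            nn.Conv1d(channels, 1, kernel_size=31, padding=15),  # back to one waveform channel
        )

    def forward(self, z, x_tilde):
        # Concatenate the latent variable with the noisy waveform along the channel axis.
        return self.net(torch.cat([z, x_tilde], dim=1))

class Discriminator(nn.Module):
    """Scores a waveform; trained to distinguish clean x from enhanced x_hat."""
    def __init__(self, channels=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=31, stride=2, padding=15),
            nn.LeakyReLU(0.2),
            nn.Conv1d(channels, 1, kernel_size=1),
        )

    def forward(self, x):
        return self.net(x).mean(dim=(1, 2))  # one score per utterance

# Toy usage: x_tilde = x + n, then x_hat = G(z, x_tilde).
T = 16384
x = torch.randn(1, 1, T)          # stand-in for a clean waveform
n = 0.1 * torch.randn(1, 1, T)    # additive background noise
x_tilde = x + n
z = torch.randn(1, 1, T)          # latent variable
G, D = Generator(), Discriminator()
x_hat = G(z, x_tilde)
print(x_hat.shape, D(x_hat).shape)  # torch.Size([1, 1, 16384]) torch.Size([1])
```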
“…1. Various losses have been proposed to improve adversarial training, such as least-squares loss [13,23], Wasserstein loss [14], relativistic loss [16], and metric loss [14]. Here, we employ the least-squares loss as in the seminal work [13].…”
Section: Self-Attention SEGAN
Mentioning, confidence: 99%
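As a concrete reference for the least-squares objective mentioned in this citation statement, the sketch below shows LSGAN-style losses for the discriminator and generator. The function names, and the assumption that the discriminator returns one unconditioned score per utterance, are illustrative; the cited works may condition the discriminator on the noisy input and weight the terms differently.

```python
# Minimal sketch of least-squares (LSGAN-style) adversarial losses.
# Assumes d_real / d_fake are discriminator scores for clean and enhanced
# signals; the exact conditioning used in the cited papers may differ.
import torch

def d_loss_ls(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    # Push scores of clean signals toward 1 and of enhanced signals toward 0.
    return 0.5 * ((d_real - 1.0) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

def g_loss_ls(d_fake: torch.Tensor) -> torch.Tensor:
    # Generator tries to make enhanced signals score like clean ones (target 1).
    return 0.5 * ((d_fake - 1.0) ** 2).mean()

# Toy usage with random scores.
d_real = torch.randn(8)
d_fake = torch.randn(8)
print(d_loss_ls(d_real, d_fake).item(), g_loss_ls(d_fake).item())
```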