2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2018
DOI: 10.1109/iscslp.2018.8706647
Speech Enhancement Based on A New Architecture of Wasserstein Generative Adversarial Networks

Cited by 11 publications (4 citation statements). References 17 publications. Citing publications span 2019 to 2024.

“…Recently, GAN models have been shown to boost generalization performance and to improve the quality of enhanced speech in the T-F domain [14,15], as in the speech enhancement generative adversarial network (SEGAN), where a conditional GAN is used for speech enhancement. Although SEGAN achieves good performance on subjective metrics, its performance measured via objective metrics such as signal-to-noise ratio (SNR) tends to be degraded, which is caused by the vanishing gradient problem during training with the conditional GAN loss [14].…”
Section: Introduction (mentioning)
confidence: 99%
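
The vanishing-gradient issue described in the statement above can be made concrete with the standard conditional GAN objectives (a sketch in our own notation, not taken from the cited papers; x denotes the noisy input, y the clean target, G the enhancer, and D the discriminator):

\mathcal{L}_D = -\,\mathbb{E}_{(x,y)}\big[\log D(y,x)\big] \;-\; \mathbb{E}_{x}\big[\log\big(1 - D(G(x),x)\big)\big]
\mathcal{L}_G = -\,\mathbb{E}_{x}\big[\log D(G(x),x)\big]

When D confidently rejects the enhanced output G(x), the log terms saturate and the gradient of \mathcal{L}_G with respect to the generator parameters shrinks toward zero, which is the vanishing-gradient behavior the quoted passage attributes to the conditional GAN loss.
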
“…To address this problem, the Wasserstein distance [16][17][18] has been introduced to improve the conditional GAN loss, resulting in the Wasserstein GAN (WGAN) method, which achieves better objective performance than SEGAN [15,19]. The WGAN method is further improved in [20] by employing metric evaluation in the conditional GAN loss, leading to the Metric GAN method, which outperforms WGAN-based methods for speech enhancement.…”
Section: Introduction (mentioning)
confidence: 99%
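
For reference, the Wasserstein distance mentioned in the statement above is usually applied in its Kantorovich-Rubinstein dual form, with a 1-Lipschitz critic f; the gradient-penalty term shown here is one common way to enforce that constraint (a sketch in our own notation; P_r and P_g denote the clean and enhanced distributions, \lambda the penalty weight):

W(P_r, P_g) = \sup_{\|f\|_L \le 1} \; \mathbb{E}_{y \sim P_r}[f(y)] - \mathbb{E}_{\hat{y} \sim P_g}[f(\hat{y})]
\mathcal{L}_{\text{critic}} = \mathbb{E}_{\hat{y} \sim P_g}[f(\hat{y})] - \mathbb{E}_{y \sim P_r}[f(y)] + \lambda\,\mathbb{E}_{\tilde{y}}\big[\big(\|\nabla_{\tilde{y}} f(\tilde{y})\|_2 - 1\big)^2\big]
\mathcal{L}_G = -\,\mathbb{E}_{\hat{y} \sim P_g}[f(\hat{y})]

Because the critic output is an unbounded score rather than a saturating probability, its gradients do not vanish in the way described for the conditional GAN loss, which is the motivation the quoted passage gives for WGAN-based speech enhancement.
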
“…Generative SE models have been accompanied by a discriminator whose task is to distinguish the original clean samples from enhanced samples. Not only does this improve the perceptual quality and intelligibility of the samples generated by the encoder-decoder generator, but the addition of an adversarial model also compensates for the distorted clean distributions in a generative adversarial network (GAN) SE system [14,15,16,17]. Therefore, high mean opinion scores on subjective tests can be achieved by providing more realistic and pleasant enhanced speech signals to listeners.…”
Section: Introduction (mentioning)
confidence: 99%
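
As a rough illustration of the clean-versus-enhanced discriminator setup described in the statement above, here is a minimal PyTorch-style sketch of one adversarial training step for an SE system; the names generator, discriminator, the optimizers, and the adv_weight value are illustrative placeholders, not the configuration of any of the cited systems.

import torch
import torch.nn.functional as F

def adversarial_se_step(generator, discriminator, g_opt, d_opt,
                        noisy, clean, adv_weight=0.01):
    """One GAN-style SE training step: the discriminator separates clean from
    enhanced samples; the generator combines an L1 reconstruction loss with an
    adversarial term (illustrative sketch, not a specific published recipe)."""
    # Discriminator update: clean speech is "real", enhanced speech is "fake".
    enhanced = generator(noisy).detach()
    d_real = discriminator(clean)
    d_fake = discriminator(enhanced)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: reconstruction loss plus the adversarial term that
    # pushes enhanced samples toward the clean-speech distribution.
    enhanced = generator(noisy)
    d_out = discriminator(enhanced)
    g_loss = (F.l1_loss(enhanced, clean)
              + adv_weight * F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out)))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
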
“…SEGAN [15] was the first approach to apply GANs to the SE task; it models a mapping between noisy and clean waveforms in an end-to-end way. Because of the unstable training process, other GAN-based systems use alternative objective functions to stabilize training, such as WGAN [18] and SERGAN [19]. All of these GAN-based SE systems directly apply the U-Net generator architecture from image-to-image translation [20].…”
Section: Introduction (mentioning)
confidence: 99%
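
To make the U-Net generator mentioned in the statement above concrete, below is a compact PyTorch-style sketch of a 1-D convolutional encoder-decoder with skip connections operating directly on waveforms; the layer count, kernel size, and channel widths are placeholders and do not reproduce SEGAN's published configuration.

import torch
import torch.nn as nn

class UNet1D(nn.Module):
    """Illustrative 1-D U-Net generator for waveform enhancement:
    strided-conv encoder, transposed-conv decoder, skip connections.
    Channel sizes are placeholders, not a published SEGAN configuration."""
    def __init__(self, channels=(1, 16, 32, 64)):
        super().__init__()
        self.down = nn.ModuleList(
            nn.Conv1d(channels[i], channels[i + 1],
                      kernel_size=31, stride=2, padding=15)
            for i in range(len(channels) - 1))
        # Decoder mirrors the encoder; all but the deepest layer receive the
        # concatenation of the previous decoder output and an encoder skip.
        self.up = nn.ModuleList(
            nn.ConvTranspose1d(channels[i + 1] * (1 if i == len(channels) - 2 else 2),
                               channels[i], kernel_size=31, stride=2,
                               padding=15, output_padding=1)
            for i in reversed(range(len(channels) - 1)))

    def forward(self, x):
        # x: (batch, 1, samples), with samples divisible by 2**len(self.down).
        skips = []
        for conv in self.down:
            x = torch.relu(conv(x))
            skips.append(x)
        skips.pop()  # the deepest activation is the bottleneck, not a skip
        for i, deconv in enumerate(self.up):
            if i > 0:  # concatenate the matching encoder output
                x = torch.cat([x, skips.pop()], dim=1)
            x = torch.tanh(deconv(x)) if i == len(self.up) - 1 else torch.relu(deconv(x))
        return x
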