2020 28th European Signal Processing Conference (EUSIPCO), 2021
DOI: 10.23919/eusipco47968.2020.9287606
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks

Cited by 15 publications (12 citation statements) · References 23 publications
“…(3) AeGAN [26] is an improved architecture which extends FSEGAN by introducing an additional non-adversarial feature-based loss function together with an enhanced generator consisting of an end-to-end concatenation of three U-nets (CasNet), as shown in Fig. 1c.…”
Section: Time-frequency Approaches
Mentioning, confidence: 99%
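For illustration, a minimal PyTorch sketch of a CasNet-style generator as described in the statement above, i.e. three U-nets concatenated end-to-end. The layer sizes and the names `SmallUNet`/`CasNet` are illustrative assumptions, not the exact AeGAN configuration.

```python
import torch
import torch.nn as nn

class SmallUNet(nn.Module):
    """A toy two-level U-net operating on time-frequency 'images'."""
    def __init__(self, ch=1, width=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(ch, width, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(width, width * 2, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(nn.ConvTranspose2d(width * 2, width, 4, stride=2, padding=1), nn.ReLU())
        self.out = nn.Conv2d(width * 2, ch, 3, padding=1)  # applied after the skip concatenation

    def forward(self, x):
        e = self.enc(x)                        # encoder features, kept as skip connection
        b = self.up(self.down(e))              # downsample, then decode back to input resolution
        return self.out(torch.cat([b, e], dim=1))

class CasNet(nn.Module):
    """End-to-end concatenation (cascade) of three U-nets."""
    def __init__(self, n_unets=3):
        super().__init__()
        self.stages = nn.ModuleList(SmallUNet() for _ in range(n_unets))

    def forward(self, x):
        for unet in self.stages:               # each stage refines the previous stage's output
            x = unet(x)
        return x

# Example: a batch of 4 single-channel TF magnitude patches of size 256 x 256.
y = CasNet()(torch.randn(4, 1, 256, 256))
print(y.shape)  # torch.Size([4, 1, 256, 256])
```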
“…For instance, FSEGAN provided a two-dimensional adaptation of the SEGAN framework to suit TF inputs [22]. Building upon FSEGAN, additional adversarial approaches were introduced to provide further improvements in the resultant speech quality [23][24][25][26]. All of the aforementioned research necessitates the sole utilization of the magnitude component for SE while ignoring the phase.…”
Section: Introduction
Mentioning, confidence: 99%
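For context, a minimal sketch of the magnitude-only pipeline the statement refers to: the STFT magnitude is enhanced while the noisy phase is reused at reconstruction. The `enhance_mag` placeholder is a hypothetical stand-in for any trained magnitude-domain model (e.g. an FSEGAN/AeGAN-style generator), not code from the cited works.

```python
import numpy as np
from scipy.signal import stft, istft

def enhance_mag(mag):
    # Placeholder "model": a crude spectral-floor operation; a real system
    # would run a trained network on the magnitude spectrogram instead.
    return np.maximum(mag - 0.1 * mag.mean(axis=1, keepdims=True), 0.0)

def denoise_magnitude_only(noisy, fs=16000, nperseg=512):
    _, _, Z = stft(noisy, fs=fs, nperseg=nperseg)      # complex TF representation
    mag, phase = np.abs(Z), np.angle(Z)                # split into magnitude and phase
    mag_hat = enhance_mag(mag)                         # enhance the magnitude only
    Z_hat = mag_hat * np.exp(1j * phase)               # reuse the (noisy) phase
    _, x_hat = istft(Z_hat, fs=fs, nperseg=nperseg)    # back to the time domain
    return x_hat

x_hat = denoise_magnitude_only(np.random.randn(16000))
```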
“…In the generator, the STFT is represented by a 2-channel image, where the channels are the real and imaginary components. We also explored a polar representation, where the channels are the modulus and the phase; additionally we experimented with processing only the modulus channel and reusing the original phase, as is done in [30]. Nevertheless, we found the real/imaginary representation to perform better in our experiments.…”
Section: STFT Representation
Mentioning, confidence: 99%
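A short sketch of the two STFT input encodings compared in the statement above: Cartesian (real/imaginary channels) versus polar (modulus/phase channels). The frame and hop sizes are illustrative assumptions.

```python
import torch

x = torch.randn(1, 16000)                                   # 1 s of audio at 16 kHz
Z = torch.stft(x, n_fft=512, hop_length=128,
               window=torch.hann_window(512), return_complex=True)

# Cartesian encoding: channels = (real, imaginary)
cart = torch.stack([Z.real, Z.imag], dim=1)                 # (batch, 2, freq, time)

# Polar encoding: channels = (modulus, phase)
polar = torch.stack([Z.abs(), Z.angle()], dim=1)            # (batch, 2, freq, time)

print(cart.shape, polar.shape)
```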
“…Again, we observed no advantage in doing so, which suggests that the neural network is capable of internally handling the phase offsets. Unlike [30], we do not convert the STFT to a logarithmic scale, as we found it to be detrimental to performance (even with various smoothing and normalization schemes).…”
Section: STFT Representation
Mentioning, confidence: 99%
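For reference, a typical log-magnitude compression of the kind the quoted authors report avoiding (they keep the STFT on a linear scale). The epsilon floor and per-utterance normalization here are illustrative assumptions, not a specific scheme from [30].

```python
import torch

def log_compress(Z, eps=1e-6):
    """Map a complex STFT to a normalized log-magnitude feature."""
    log_mag = torch.log(Z.abs() + eps)                          # compress the dynamic range
    return (log_mag - log_mag.mean()) / (log_mag.std() + eps)   # per-utterance normalization

Z = torch.stft(torch.randn(1, 16000), n_fft=512, hop_length=128,
               window=torch.hann_window(512), return_complex=True)
features = log_compress(Z)
```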