2021 29th European Signal Processing Conference (EUSIPCO), 2021
DOI: 10.23919/eusipco54536.2021.9616114
Towards speech enhancement using a variational U-Net architecture

Abstract: In this paper, we investigate the viability of a variational U-Net architecture for denoising of single-channel audio data. Deep network speech enhancement systems commonly aim to estimate filter masks, or opt to skip preprocessing steps to directly work on the waveform signal, potentially neglecting relationships across higher-dimensional spectro-temporal features. We study the adoption of a probabilistic bottleneck, as well as dilated convolutions, into the classic U-Net architecture. Evaluation of a number …
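The abstract names the main ingredients (a U-Net encoder-decoder, a probabilistic bottleneck, and dilated convolutions) without architectural detail. The following PyTorch sketch is one plausible way to combine them for spectrogram denoising; channel widths, depths, dilation rates, and the class name VariationalUNet are illustrative assumptions, not the configuration used in the paper.

```python
# Hypothetical sketch of a variational U-Net for spectrogram denoising.
# Layer sizes, depths, and dilation rates are assumptions, not the paper's setup.
import torch
import torch.nn as nn

class VariationalUNet(nn.Module):
    def __init__(self, ch=32, latent_ch=64):
        super().__init__()
        # Encoder: strided convolutions halve the time-frequency resolution.
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, 2 * ch, 4, stride=2, padding=1), nn.ReLU())
        # Dilated convolutions enlarge the receptive field without more downsampling.
        self.dilated = nn.Sequential(
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv2d(2 * ch, 2 * ch, 3, padding=4, dilation=4), nn.ReLU(),
        )
        # Probabilistic bottleneck: predict mean and log-variance, then sample.
        self.to_mu = nn.Conv2d(2 * ch, latent_ch, 1)
        self.to_logvar = nn.Conv2d(2 * ch, latent_ch, 1)
        self.from_z = nn.Conv2d(latent_ch, 2 * ch, 1)
        # Decoder with skip connections from the encoder (channel concatenation).
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(4 * ch, ch, 4, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(2 * ch, 1, 4, stride=2, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        h = self.dilated(e2)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        d2 = self.dec2(torch.cat([self.from_z(z), e2], dim=1))
        return self.dec1(torch.cat([d2, e1], dim=1)), mu, logvar

# Usage: a batch of 1-channel magnitude spectrograms (dimensions divisible by 4).
net = VariationalUNet()
clean_est, mu, logvar = net(torch.randn(2, 1, 128, 128))
```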

Cited by 4 publications (4 citation statements). References 18 publications.
“…As found in many audio enhancement applications, the vanilla architecture usually does not produce impressive audio quality in the output (although comparable to, or slightly better than, machine learning algorithms, e.g. [66], [67], [68], [69], [70], and [71]). Therefore, additional layers are added, skip connections are modified, hyperparameters are adjusted, or entirely new architectures are proposed in many recent AE models to improve performance.…”
Section: Modified Architectures
Mentioning confidence: 91%
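As a concrete instance of one modification category named above (modified skip connections), the sketch below replaces a plain U-Net concatenation skip with a learnable sigmoid gate; the gating scheme and all names are assumptions for illustration, not taken from the cited works.

```python
# Hypothetical modified skip connection: encoder features are scaled by a
# learned sigmoid gate before being merged into the decoder path.
import torch
import torch.nn as nn

class GatedSkip(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, decoder_feat, encoder_feat):
        g = self.gate(torch.cat([decoder_feat, encoder_feat], dim=1))
        return torch.cat([decoder_feat, g * encoder_feat], dim=1)

skip = GatedSkip(32)
merged = skip(torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64))  # -> (1, 64, 64, 64)
```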
“…In contrast to the deterministic characteristics of vanilla U-Net, the VAE in the bottleneck offers increased robustness towards out-of-distribution effects, such as reverberation and unknown noise types [67].…”
Section: Variational Auto-Encoder (VAE)
Mentioning confidence: 99%
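To make the contrast with the deterministic U-Net concrete, the snippet below sketches the stochastic bottleneck step (reparameterization) that injects noise around the predicted latent mean during training, the mechanism credited with the improved out-of-distribution robustness; the function names are illustrative assumptions.

```python
# Minimal sketch of the stochastic bottleneck step (reparameterization trick).
import torch

def sample_latent(mu: torch.Tensor, logvar: torch.Tensor, training: bool = True) -> torch.Tensor:
    if not training:
        return mu                              # deterministic at inference
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(mu)     # decoder sees perturbed latents

def kl_divergence(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # KL(q(z|x) || N(0, I)) averaged over all elements, added to the denoising loss.
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
```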
“…The U-Network structure is similar to an autoencoder, yet probabilistic latent space models similar to variational autoencoders were previously only used for image segmentation tasks [15]. In our previous work [16], we showed that including a probabilistic latent space model in a U-Network increases its generalization ability by introducing noise into the latent space, thereby indirectly increasing the variety of features the network processes during training. Here, we extend our previous work to model the latent space of a complex-valued U-Network by introducing two separate latent spaces for the real and imaginary parts, respectively.…”
Mentioning confidence: 99%
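The extension described here, with separate latent spaces for the real and imaginary parts of a complex-valued U-Network, could be organized roughly as sketched below; the module name, the 1x1 projections, and the fully independent branches are assumptions for illustration rather than the authors' implementation.

```python
# Rough sketch of a bottleneck with two separate latent spaces, one for the
# real and one for the imaginary part of a complex-valued feature map.
import torch
import torch.nn as nn

class SplitComplexBottleneck(nn.Module):
    def __init__(self, in_ch, latent_ch):
        super().__init__()
        self.mu_re, self.logvar_re = nn.Conv2d(in_ch, latent_ch, 1), nn.Conv2d(in_ch, latent_ch, 1)
        self.mu_im, self.logvar_im = nn.Conv2d(in_ch, latent_ch, 1), nn.Conv2d(in_ch, latent_ch, 1)

    @staticmethod
    def _sample(mu, logvar):
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, feat_re, feat_im):
        z_re = self._sample(self.mu_re(feat_re), self.logvar_re(feat_re))
        z_im = self._sample(self.mu_im(feat_im), self.logvar_im(feat_im))
        return z_re, z_im  # decoded separately or recombined as a complex tensor

bottleneck = SplitComplexBottleneck(in_ch=64, latent_ch=32)
z_re, z_im = bottleneck(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```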