ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9053934

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

Abstract: We propose a time-domain audio source separation method using down-sampling (DS) and up-sampling (US) layers based on a discrete wavelet transform (DWT). The proposed method is based on one of the state-of-the-art deep neural networks, Wave-U-Net, which successively down-samples and up-samples feature maps. We find that this architecture resembles that of multiresolution analysis, and reveal that the DS layers of Wave-U-Net cause aliasing and may discard information useful for the separation. Although the effe…
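The aliasing mentioned in the abstract can be illustrated with a small NumPy sketch. It assumes that the down-sampling in question amounts to keeping every other sample with no low-pass filter beforehand; the sampling rate, tone frequencies, and the function name `decimate_by_two` are hypothetical values chosen for illustration and do not come from the paper.

```python
import numpy as np

def decimate_by_two(x):
    """Keep every other sample, with no anti-aliasing filter (assumed DS behavior)."""
    return x[::2]

fs = 16        # hypothetical sampling rate
n = np.arange(64)
f_high = 6     # above the post-decimation Nyquist frequency fs / 4 = 4
f_alias = 2    # fs / 2 - f_high: where the high tone lands after decimation

high = np.cos(2 * np.pi * f_high * n / fs)
low = np.cos(2 * np.pi * f_alias * n / fs)

# The two tones differ at the original rate...
differ_before = not np.allclose(high, low)

# ...but become sample-for-sample identical after decimation.
indistinguishable = np.allclose(decimate_by_two(high), decimate_by_two(low))

print(differ_before, indistinguishable)
```

Once the two tones collapse onto the same samples, no later layer can tell them apart, which is one way to read the claim that such layers may discard information useful for separation.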

Citations: cited by 19 publications (16 citation statements)
References: 19 publications (19 reference statements)
“…Hence, no additive (tonal) or subtractive (filtering) artifacts emerge after downsampling/upsampling with wavelets. However, the perfect reconstruction property only holds in absence of any processing in the wavelet domain, and we describe wavelet-inspired downsampling/upsampling layers that, combined with neural networks, have the capacity to perform downstream tasks [8,16]:
- Lazy wavelet layers downsample the signal into odd/even samples, and interleave odd/even samples for upsampling.
- Haar wavelet layers can be described following the filter bank scheme in Fig.…”
Section: Wavelet-based Upsamplers and Filtering Artifacts (mentioning)
confidence: 99%
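The lazy and Haar wavelet layers described in this statement can be sketched in NumPy as follows. This is a minimal illustration, not the implementation of the cited papers: it assumes a single-channel 1-D signal of even length, an orthonormal Haar scaling of 1/sqrt(2), and hypothetical function names (`lazy_down`, `lazy_up`, `haar_down`, `haar_up`).

```python
import numpy as np

def lazy_down(x):
    """Lazy wavelet: split into even-indexed and odd-indexed samples."""
    return np.stack([x[0::2], x[1::2]])

def lazy_up(c):
    """Inverse lazy wavelet: interleave the two sub-signals."""
    x = np.empty(2 * c.shape[1])
    x[0::2] = c[0]
    x[1::2] = c[1]
    return x

def haar_down(x):
    """Haar analysis: scaled sum (low-pass) and difference (high-pass) of sample pairs."""
    even, odd = x[0::2], x[1::2]
    return np.stack([(even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)])

def haar_up(c):
    """Haar synthesis: recover the even and odd samples from the two sub-bands."""
    low, high = c
    x = np.empty(2 * low.shape[0])
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x

rng = np.random.default_rng(0)
signal = rng.standard_normal(16)

lazy_ok = np.allclose(lazy_up(lazy_down(signal)), signal)
haar_ok = np.allclose(haar_up(haar_down(signal)), signal)
print(lazy_ok, haar_ok)
```

Both round trips return the input because the two sub-signals together keep all of the original samples. As the statement notes, this holds only when nothing modifies the coefficients between the down-sampling and up-sampling steps.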
“…Further, wavelets provide a principled way to downsample. Inspired by that, Nakamura and Saruwatari [8] replaced WaveUnet's [2] downsampling (discarding every other time step) and upsampling (linear interpolation) layers by lazy and Haar wavelet layers. However, the above wavelet layers are designed to downsample/upsample x2, as the original WaveUnet.…”
Section: Wavelet-based Upsamplers and Filtering Artifacts (mentioning)
confidence: 99%
“…Note that this paper is partially based on our international conference papers [24], [25]. This paper has the following seven additional contributions: (i) Although the DWT layer we presented in the conference paper is designed only for predetermined wavelets, we extend it to those jointly trainable with the other DNN modules.…”
Section: Introduction (mentioning)
confidence: 99%