2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
DOI: 10.1109/waspaa.2019.8937186
Attention Wave-U-Net for Speech Enhancement

Cited by 90 publications (43 citation statements)
References 17 publications
“…Interpolation + convolution was proposed as an alternative to avoid transposed convolution artifacts [15]. It has been used, e.g., for music source separation [2], speech enhancement [22] and neural vocoding [14]. While interpolation (e.g., linear or nearest neighbor) is effectively upsampling the signal, the subsequent convolution further transforms the upsampled signal with learnable weights.…”
Section: Interpolation Upsamplers
confidence: 99%
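The interpolation + convolution upsampler described above can be sketched in a few lines: the signal is first upsampled by (here) linear interpolation, and the result is then transformed by a convolution with learnable weights. This is a minimal illustrative sketch, not the cited implementation; the kernel values are placeholders standing in for learned weights.

```python
def linear_upsample(x, factor=2):
    """Upsample a 1-D signal by linear interpolation between samples."""
    out = []
    for i in range(len(x) - 1):
        for k in range(factor):
            t = k / factor
            out.append((1 - t) * x[i] + t * x[i + 1])
    out.append(x[-1])  # keep the final sample
    return out

def conv1d(x, kernel):
    """Valid-mode 1-D convolution with a (learnable) kernel."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

signal = [0.0, 1.0, 0.0, -1.0]
up = linear_upsample(signal, factor=2)    # interpolated (upsampled) signal
smoothed = conv1d(up, [0.25, 0.5, 0.25])  # convolution further transforms it
```

Because the interpolation stage is fixed and smooth, the subsequent convolution does not have to learn the upsampling itself, which is what avoids the checkerboard-style artifacts of transposed convolution.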
“…In the joint training process, an attention mechanism is added between the input in the SED module and the output passing through each convolutional layer and maxpooling layer. We exploit an attention mechanism of a similar form used in [39,40], but with different stride sizes of each convolutional layer, normalization, and skip-connection. As shown in Figure 3, i is set as a variable representing the order of blocks (convolutional and max-pooling layers) to which the attention module is added.…”
Section: Attention Mechanism
confidence: 99%
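The attention mechanism described above, placed between an input and a convolutional block's output, can be sketched as a simple additive attention gate: a gating signal and a feature vector are combined, squashed through a sigmoid, and the resulting coefficients reweight the features element-wise. This is a hedged sketch of the general gating form, not the cited model; the weights `w_x` and `w_g` are illustrative placeholders.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def attention_gate(x, g, w_x, w_g):
    """Compute per-element attention coefficients from features x and
    gating signal g, then reweight x element-wise."""
    alpha = [sigmoid(w_x * xi + w_g * gi) for xi, gi in zip(x, g)]
    gated = [a * xi for a, xi in zip(alpha, x)]
    return gated, alpha

features = [0.2, -1.0, 3.0]   # e.g. output of a conv + max-pooling block
gating = [0.5, 0.1, -0.3]     # e.g. derived from the module input
gated, alpha = attention_gate(features, gating, w_x=1.0, w_g=1.0)
```

In the actual networks the projections are learned convolutions and the gate is applied per feature map, but the element-wise reweighting pattern is the same.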
“…In addition, the far-end information, which helps to subtract an acoustic echo, is extracted by passing through the auxiliary encoder, and it is transmitted as compressed latent features of the mixtures through an element-wise multiplication. To build a more effective connection, an attention mechanism [25] is applied to enable efficient information delivery, as described in detail in Figure 3. The latent features of the far-end and compressed multi-channel mixture are mapped to the intermediate feature space.…”
Section: Proposed Multi-Channel Cross-Tower With Attention Mechanism
confidence: 99%
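The fusion described in the excerpt above can be sketched as follows: both the far-end latent features and the compressed mixture features are projected into a shared intermediate space, and the far-end information is then injected via element-wise multiplication. This is a toy sketch under those assumptions; the scalar projections stand in for learned mappings.

```python
def project(x, weight, bias):
    """Toy linear projection into the intermediate feature space
    (a stand-in for a learned mapping)."""
    return [weight * xi + bias for xi in x]

def fuse(mixture_latent, far_end_latent):
    m = project(mixture_latent, weight=0.8, bias=0.0)
    f = project(far_end_latent, weight=1.2, bias=0.1)
    # Element-wise multiplication transmits the far-end information
    # into the mixture path.
    return [mi * fi for mi, fi in zip(m, f)]

fused = fuse([1.0, 2.0], [0.5, -0.5])
```

The multiplicative coupling lets the far-end features act as a mask-like modulation on the mixture's latent representation, which is what makes it useful for suppressing the echo component.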