2023
DOI: 10.1109/access.2023.3324210

Multi-Attention Bottleneck for Gated Convolutional Encoder-Decoder-Based Speech Enhancement

Nasir Saleem,
Teddy Surya Gunawan,
Muhammad Shafi
et al.

Abstract: The convolutional encoder-decoder (CED) has emerged as a powerful architecture, particularly in speech enhancement (SE), which aims to improve the quality and intelligibility of noise-contaminated speech. This architecture leverages the strength of convolutional neural networks (CNNs) in capturing high-level features. Usually, CED architectures use a gated recurrent unit (GRU) or long short-term memory (LSTM) as a bottleneck to capture temporal dependencies, enabling an SE model to eff…
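The abstract describes a recurrent bottleneck (GRU or LSTM) between the convolutional encoder and decoder that carries temporal context across frames. A minimal single GRU cell in pure NumPy (random weights, hypothetical dimensions; illustrative only, not the paper's implementation) shows how that state accumulates over a frame sequence:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with random weights, for illustration only."""
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1  # small init keeps activations in a sane range
        # update gate (z), reset gate (r), candidate state weights
        self.Wz = s * rng.standard_normal((hidden_dim, input_dim))
        self.Uz = s * rng.standard_normal((hidden_dim, hidden_dim))
        self.Wr = s * rng.standard_normal((hidden_dim, input_dim))
        self.Ur = s * rng.standard_normal((hidden_dim, hidden_dim))
        self.Wh = s * rng.standard_normal((hidden_dim, input_dim))
        self.Uh = s * rng.standard_normal((hidden_dim, hidden_dim))

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h)  # how much to update
        r = sigmoid(self.Wr @ x + self.Ur @ h)  # how much past state to expose
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h))
        return (1.0 - z) * h + z * h_cand       # blend old state and candidate

# Feed a sequence of hypothetical encoder feature frames through the bottleneck.
cell = GRUCell(input_dim=8, hidden_dim=4)
frames = np.random.default_rng(1).standard_normal((10, 8))
h = np.zeros(4)
for x in frames:
    h = cell.step(x, h)  # h carries context across time frames
print(h.shape)  # (4,)
```

In a full CED-based SE model this cell would sit between the encoder's final feature maps and the decoder, replacing per-frame independence with learned temporal dependencies.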

Cited by 6 publications (2 citation statements) | References 70 publications
“…The model performance was evaluated in terms of perceptual evaluation of speech quality (PESQ) [35] and short-term objective intelligibility (STOI) [36]. The performance of the TFANUNet model is compared against the following baselines: CRN [11], TCNN [37], DCCRN [38], CS-CRN [39], DeepxiMMSE [40], MASENet [25], SADNUNet [17], and MAB-CED [26].…”
Section: Experimental Results Analysis
confidence: 99%
“…In the MAB-CED [26] and U-Transformer + FAT models [27], the TFA method is employed. This method consists of two main components: time-dimension attention and frequency-dimension attention, which work together to generate 1-dimensional attention maps.…”
Section: Introduction
confidence: 99%
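The citation above describes TFA as two 1-dimensional attention maps, one along time and one along frequency. A NumPy sketch of that idea (a hypothetical simplification using average pooling and a sigmoid; the actual TFA modules use learned layers) shows how the two maps combine to reweight a spectrogram:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tfa(spec):
    """Time-frequency attention sketch (hypothetical simplification):
    build one 1-D attention map per axis by average pooling, then
    reweight the spectrogram by their outer product.
    spec: (freq_bins, time_frames) non-negative magnitude spectrogram."""
    freq_att = sigmoid(spec.mean(axis=1))  # (freq_bins,) frequency-dimension map
    time_att = sigmoid(spec.mean(axis=0))  # (time_frames,) time-dimension map
    return spec * np.outer(freq_att, time_att)

# Apply to a random magnitude spectrogram.
rng = np.random.default_rng(0)
spec = np.abs(rng.standard_normal((257, 100)))  # e.g. 512-point FFT, 100 frames
out = tfa(spec)
print(out.shape)  # (257, 100)
```

Because each attention value lies in (0, 1), the outer product acts as a soft mask: time-frequency regions where both maps are low are attenuated most, which is the mechanism the cited models exploit.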