2022
DOI: 10.1109/lsp.2021.3128374

A Nested U-Net With Self-Attention and Dense Connectivity for Monaural Speech Enhancement

Cited by 41 publications (6 citation statements)
References 43 publications
“…The model performance evaluated in terms of perceptual evaluation of speech quality (PESQ) [35] and short-term objective intelligibility (STOI) [36]. The performance of the TFANUNet model is compared against following baselines: CRN [11] , TCNN [37], DCCRN [38], CS-CRN [39], DeepxiMMSE [40], MASENet [25], SADNUNet [17], and MAB-CED [26].…”
Section: Experimental Results Analysis (mentioning)
confidence: 99%
“…Using such dense connectivity allows information to flow at maximum speed and be in-depth with maximum depth, while minimising the complexity of the model by effectively reusing intermediate representations between layers. In [16][17][18][19][20], the authors proposed a DenseNet to reduce the number of extended dilated convolutional layers and cover the large receptive area. In DenseNet, the features of the early and later layers are directly merged into a single convolutional layer via dense skip connectivity.…”
Section: Introduction (mentioning)
confidence: 99%
“…Various models based on U-Net have been studied, including the Nested U-Net [41] and Wave-U-Net [42], and studies on speech enhancement using these modified models have been conducted. Xiang et al applied self-attention to the Nested U-Net structure, which contains skip connections at each stage of the encoder and decoder, to perform speech enhancement using contextual information at various scales [43]. Craig et al used Wave-U-Net, which applies U-Net to a one-dimensional time domain, to separate vocals and accompaniments in music for speech enhancement [44].…”
Section: B Speech Enhancement (mentioning)
confidence: 99%
“…Previous studies have demonstrated that real-valued DNNs offer numerous advantages in acoustic echo cancellation, including efficient acoustic echo cancellation, strong adaptability, high-quality output, excellent real-time performance, and robust scalability. It ensures clear and natural speech signals, making it suitable for a wide range of real-time communication and speech processing applications [29,30]. Building on the works presented in [15], we incorporate the PE module into our model to facilitate the conversion of the complex spectra to a real spectrum.…”
Section: Phase Encoder and Time-frequency Convolution Module (mentioning)
confidence: 99%