2021 29th European Signal Processing Conference (EUSIPCO)
DOI: 10.23919/eusipco54536.2021.9616267

Investigating Cross-Domain Losses for Speech Enhancement

Abstract: Recent years have seen a surge in the number of available frameworks for speech enhancement (SE) and recognition. Whether model-based or constructed via deep learning, these frameworks often rely in isolation on either time-domain signals or time-frequency (TF) representations of speech data. In this study, we investigate the advantages of each set of approaches by separately examining their impact on speech intelligibility and quality. Furthermore, we combine the fragmented benefits of time-domain and TF spee…

Citations: cited by 5 publications (2 citation statements)
References: 46 publications
“…Here we normalize the PESQ score to the range [0,1]. Moreover, an additional penalization in the resultant waveform L Time is proven to improve the restored speech quality [20]:…”
Section: Loss Function (mentioning)
confidence: 99%
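
The normalization and waveform penalty mentioned in the statement above could look roughly like the following sketch. It assumes raw PESQ scores lie in [-0.5, 4.5] and uses a mean absolute error as the time-domain term. The quoted statement gives neither the score range nor the form of L Time, so both are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

# Assumed range of raw PESQ scores; not stated in the quoted text.
PESQ_MIN, PESQ_MAX = -0.5, 4.5


def normalize_pesq(score):
    """Map a raw PESQ score linearly onto [0, 1], clipping out-of-range values."""
    clipped = np.clip(score, PESQ_MIN, PESQ_MAX)
    return (clipped - PESQ_MIN) / (PESQ_MAX - PESQ_MIN)


def time_domain_penalty(enhanced, clean):
    """Mean absolute error between two waveforms (assumed form of L Time)."""
    enhanced = np.asarray(enhanced, dtype=float)
    clean = np.asarray(clean, dtype=float)
    return float(np.mean(np.abs(enhanced - clean)))
```

A combined objective would then add the waveform penalty to whatever term consumes the normalized score; the weighting between the two is not given in the excerpt.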
“…In the TF-domain, most conventional model-based or DL techniques utilize the magnitude component while ignoring the phase. This is accounted to the unstructured phase component, which imposes challenges to the utilized architectures [19], [20]. To circumvent this challenge, several approaches follow the strategy of enhancing the complex spectrogram (real and imaginary parts), which implicitly enhances both magnitude and phase [21], [22].…”
Section: Introduction (mentioning)
confidence: 99%
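
To illustrate why enhancing the real and imaginary parts implicitly enhances both magnitude and phase, the short sketch below recovers magnitude and phase from the two components. The function name is hypothetical and the example is not taken from the cited works; it only shows the standard relation between the Cartesian and polar forms of a complex spectrogram.

```python
import numpy as np


def magnitude_and_phase(real, imag):
    """Return the magnitude and phase of a complex spectrogram given its parts."""
    spec = np.asarray(real, dtype=float) + 1j * np.asarray(imag, dtype=float)
    return np.abs(spec), np.angle(spec)


# Changing only the real part of a bin changes both its magnitude and its phase.
mag_a, phase_a = magnitude_and_phase([3.0], [4.0])
mag_b, phase_b = magnitude_and_phase([1.0], [4.0])
```

Because magnitude and phase are both functions of the real and imaginary parts, a model that predicts those two components never has to estimate the unstructured phase directly.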