2022
DOI: 10.3390/computation10060102

An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning

Abstract: The purpose of speech enhancement is to improve the quality of speech signals degraded by noise, reverberation, or other artifacts that can affect the intelligibility, automatic recognition, or other attributes involved in speech technologies and telecommunications, among others. In such applications, it is essential to provide methods to enhance the signals to allow the understanding of the messages or adequate processing of the speech. For this purpose, during the past few decades, several techniques have be…

Cited by 8 publications (2 citation statements)
References 43 publications
“…Subsequently, these coefficients were utilized for image reconstruction and the extraction of denoised images post 2D inverse WT (IDWT) operations. The quality of denoised images were assessed by comparing them with the original noisy ones using a metric called peak signal-to-noise ratio (PSNR) [21], [23], [24].…”
Section: Wavelet Transform
Citation type: mentioning
confidence: 99%
“…Sound recognition plays an important role in most of the encountered audio and audiovisual pattern analysis cases, where related content is massively produced and uploaded (i.e., digital audio broadcasting, podcasts and web radio, but also video on demand (VoD), web-TV and multimodal UGC sharing in general). Specifically, there are various pattern recognition and semantic analysis tasks in the audio domain, including speech-music segmentation [8], genre recognition [10], speaker verification and voice diarization [11], speech enhancement [12,13], sound event detection [14], phoneme and speech recognition [15][16][17], as well as topic/story classification [18][19][20], sentiment analysis and opinion extraction [4,5,21], multiclass audio discrimination [22], environmental sound classification [23] and biomedical audio processing [24]. Audio broadcast is generally considered to be one of the most demanding recognition cases, where a large diversity of content types with many detection difficulties are implicated [1].…”
Section: Introduction
Citation type: mentioning
confidence: 99%