2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009
DOI: 10.1109/icassp.2009.4960505

Combining frontend-based memory with MFCC features for Bandwidth Extension of narrowband speech

Abstract: In this paper, we continue our previous work on improving Bandwidth Extension (BWE) of narrowband speech. We have shown that including memory into the parametrization frontend (through delta features) results in higher highband certainty irrespective of feature type, with MFCCs exhibiting higher correlation, in general, between both bands, reaching twice that using LSFs. By incorporating memory into the frontend of a conventional LP-based BWE system, we were able to translate the higher correlation due to memo…
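
The abstract refers to adding memory to the frontend through delta features. As a rough illustration only, the sketch below computes deltas with the common regression formula over a symmetric window and appends them to the static features. The function names, the window width of 2 frames, and the edge padding are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def delta_features(static, width=2):
    """Regression-based deltas for a (frames, coeffs) feature matrix.

    Hypothetical helper: 'width' (frames on each side) and edge padding
    are illustrative choices, not the paper's settings.
    """
    static = np.asarray(static, dtype=float)
    n_frames = static.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(static, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    delta = np.zeros_like(static)
    for n in range(1, width + 1):
        ahead = padded[width + n: width + n + n_frames]
        behind = padded[width - n: width - n + n_frames]
        delta += n * (ahead - behind)
    return delta / denom


def add_memory(static, width=2):
    """Concatenate static features with their deltas along the coefficient axis."""
    return np.hstack([static, delta_features(static, width)])
```

Applied to a matrix of per-frame MFCCs or LSFs, `add_memory` doubles the feature dimension, so each frame carries a local estimate of how its coefficients are changing over time.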

Cited by 14 publications (12 citation statements)
References 10 publications

“…The performance of the GMM-based spectral envelope extension was then enhanced by using Mel frequency cepstral coefficients (MFCCs) instead of LPC coefficients [11]. The GMM mapping with memory further results in better performance in terms of log spectral distortion (LSD) and perceptual evaluation of speech quality (PESQ) [12]. The advantage of the GMM in envelope extension methods is that they offer a continuous approximation from NB to WB features compared to the discrete acoustic space resulting from vector quantization (VQ). Better results were reported for GMM-based methods compared to codebook mapping in [10] in terms of SD, cepstral distance, and a paired subjective comparison.…”
Section: Gaussian Mixture Model (mentioning)
confidence: 99%
“…Several methods including codebook mapping, neural networks, and Gaussian mixtures models (GMM) have been proposed for estimating the highband parameters. GMM methods have been used, e.g., with spectral vectors [2], mel-frequency cepstral coefficients (MFCC) [3], and line spectral frequencies (LSF) [4]. Further approaches utilizing GMM in ABE include adjusting the temporal envelope and gain of the highband sub-bands based on GMM-estimated parameters [5] and using GMM for both highband prediction and denoising of lowband features [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…Thus, we further introduced the cosh measure to evaluate the performance of the reconstructed audio signals and could obtain more subjectively correlated results [20]. The cosh measure is defined as,…”
Section: Cosh Measure (mentioning)
confidence: 99%
“…Audio signals generated using the proposed TSCC-based method, MFCC-based method, G.729.1 at 32 kbit/s, and G.729.1 Annex E at 36 kbit/s were objectively evaluated in terms of the log spectral distortion (LSD) [35,36], cosh measure [20,35], and differential log spectral distortion (DLSD) [37] in comparison with the original SWB audio.…”
Section: Objective Evaluation (mentioning)
confidence: 99%
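
Several of the statements above report results as log spectral distortion (LSD). The citing papers' exact formulas are not reproduced on this page, so the sketch below uses one common definition as an assumption: the per-frame root-mean-square difference, in dB, between reference and estimated power spectra, averaged over frames. The function name and the small `eps` floor are hypothetical choices for this example.

```python
import numpy as np


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """Mean per-frame LSD in dB between two (frames, bins) power spectra.

    Assumed definition: RMS over frequency bins of the dB difference,
    then averaged over frames. 'eps' guards against log of zero.
    """
    ref_db = 10.0 * np.log10(np.asarray(ref_power, dtype=float) + eps)
    est_db = 10.0 * np.log10(np.asarray(est_power, dtype=float) + eps)
    per_frame = np.sqrt(np.mean((ref_db - est_db) ** 2, axis=-1))
    return float(np.mean(per_frame))
```

In a bandwidth extension setting, a measure like this would usually be restricted to the highband bins, since the narrowband portion is left unchanged by the extension.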