Weighted autocorrelation for pitch extraction of noisy speech

Shimamura, Tetsuya; Kobayashi, Hayato

doi:10.1109/89.952490

Cited by 192 publications

(120 citation statements)

References 7 publications

Citation statement types: Supporting, Mentioning (118), Contrasting, Unclassified

“…Hasan proposed signal reshaping technique [13] for emphasizing the true peak. Shimamura proposed weighted the ACF [14] by the inverse average magnitude difference function [9].…”

Section: Problem Description (mentioning)

confidence: 99%

Correlation Based Fundamental Frequency Extraction Method in Noisy Speech Signal

Hasan

2017

IJCSEIT
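
The statement above says the autocorrelation function (ACF) is weighted by the inverse of the average magnitude difference function (AMDF). A minimal sketch of that idea follows, assuming the weighting takes the form ACF(tau) / (AMDF(tau) + k). The function names, the constant k = 1.0, the fixed summation length, and the lag search range are illustrative assumptions, not details taken from this page.

```python
import numpy as np


def weighted_autocorrelation(frame, num_lags, k=1.0):
    """Return ACF(tau) / (AMDF(tau) + k) for tau = 0 .. num_lags - 1.

    k is an assumed positive constant that keeps the denominator
    away from zero when the AMDF vanishes at the pitch period.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame) - num_lags  # same number of samples summed at every lag
    if n <= 0:
        raise ValueError("frame must be longer than num_lags")
    head = frame[:n]
    wac = np.empty(num_lags)
    for tau in range(num_lags):
        shifted = frame[tau:tau + n]
        acf = np.dot(head, shifted) / n          # peaks at the pitch period
        amdf = np.mean(np.abs(head - shifted))   # dips at the pitch period
        wac[tau] = acf / (amdf + k)
    return wac


def estimate_pitch_hz(frame, sample_rate, f_min=50.0, f_max=400.0, k=1.0):
    """Pick the lag with the largest weighted autocorrelation in the search range."""
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    wac = weighted_autocorrelation(frame, max_lag + 1, k)
    best_lag = min_lag + int(np.argmax(wac[min_lag:max_lag + 1]))
    return sample_rate / best_lag


if __name__ == "__main__":
    # Synthetic voiced frame: 200 Hz fundamental with two harmonics plus noise.
    rng = np.random.default_rng(0)
    sample_rate = 8000
    t = np.arange(512) / sample_rate
    clean = (np.sin(2 * np.pi * 200 * t)
             + 0.5 * np.sin(2 * np.pi * 400 * t)
             + 0.25 * np.sin(2 * np.pi * 600 * t))
    noisy = clean + 0.1 * rng.standard_normal(t.size)
    print(estimate_pitch_hz(noisy, sample_rate, f_min=120.0))
```

The ACF has a peak and the AMDF has a dip at the pitch lag, so the ratio sharpens the peak relative to either function alone. With a wide search range, an integer multiple of the true period can score as high as the period itself, so the example narrows the range with f_min=120.0.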


“…It is beneficial to improve the accuracy in estimating the pitch period. A weighted autocorrelation function (WAC) is then computed to improve the discriminability at the pitch position, given as [12] …”

Section: Detection Of Vowel Frames (mentioning)

confidence: 99%

Estimation of Noise Magnitude for Speech Denoising Using Minima-Controlled-Recursive-Averaging Algorithm Adapted by Harmonic Properties

Lei, Shen, et al. 2016

Applied Sciences

Abstract: The accuracy of noise estimation is important for the performance of a speech denoising system. Most noise estimators suffer from either overestimation or underestimation on the noise level. An overestimate on noise magnitude will cause serious speech distortion for speech denoising. Conversely, a great quantity of residual noise will occur when the noise magnitude is underestimated. Accurately estimating noise magnitude is important for speech denoising. This study proposes employing variable segment length for noise tracking and variable thresholds for the determination of speech presence probability, resulting in the performance improvement for a minima-controlled-recursive-averaging (MCRA) algorithm in noise estimation. Initially, the fundamental frequency was estimated to determine whether a frame is a vowel. In the case of a vowel frame, the increment of segment lengths and the decrement of threshold for speech presence were performed which resulted in underestimating the level of noise magnitude. Accordingly, the speech distortion is reduced in denoised speech. On the contrary, the segment length decreases rapidly in noise-dominant regions. This enables the noise estimate to update quickly and the noise variation to track well, yielding interference noise being removed effectively through the process of speech denoising. Experimental results show that the proposed approach has been effective in improving the performance of the MCRA algorithm by preserving the weak vowels and consonants. The denoising performance is therefore improved.

“…There are many pitch estimation algorithms available now-a-days. Different algorithms have been implemented in the time domain [48,49] but none of them meets the desired performance of pitch estimation. The pitch estimation is also performed in the transformed domain.…”

Section: Pitch Estimation (mentioning)

confidence: 99%

Empirical Mode Decomposition for Advanced Speech Signal Processing

Molla, Das, Hamid, et al. 2013

Journal of Signal Processing

Abstract: Empirical mode decomposition (EMD) is a newly developed tool to analyze nonlinear and non-stationary signals. It is used to decompose any signal into a finite number of time varying subband signals termed as intrinsic mode functions (IMFs). Such data adaptive decomposition is recently used in speech enhancement. This study presents the concept of EMD and its application to advanced speech signal processing paradigms including speech enhancement by soft-thresholding, voiced/unvoiced (V/Uv) speech discrimination and pitch estimation. The speech processing is frequently performed in the transformed domain and the transformation is usually achieved by traditional signal analysis techniques i.e. Fourier and wavelet transformations. These analysis methods employ priori basis function and it is not suitable for data adaptive analysis for non-stationary signal like speech. Recently, EMD is taken much attention for speech signal processing in data adaptive way. Several EMD based potential soft-thresholding algorithms for speech enhancement are discussed here. The V/Uv discrimination is an important concern in speech processing. It is usually performed by using acoustic features. The training data is used to determine the threshold for classification. The EMD based data adaptive thresholding approach is developed for V/Uv discrimination without any training phase. Noticeable improvement is achieved with the application of EMD in pitch estimation of noisy speech signals. The related experimental results are also presented to realize the effectiveness of EMD in advanced speech processing algorithms.
