Adaptive noise estimation algorithm for speech enhancement

Lin, Lao-Sheng; Holmes, W.H.; Ambikairajah, Eliathamby

doi:10.1049/el:20030480

Cited by 42 publications

(18 citation statements)

References 7 publications


“…Our MMSE solution is similar, in essence, to what was reported in [7] and [8] and is based on solving an overdetermined system of equations using GMMs of speech and different noise source candidates. In fact the mean vectors of power spectra models of noise and clean speech are formed as follows: ∑ ∑ (6) in which refers to the frequency bins in the FFT domain and varies from 0 to 256 in our case for each noise source candidate whose mixture model is available. As there are as many equations as frequency bins but only unknowns, the MMSE solution, provided by a standard algorithm, returns not only the parameters and (negative values are excluded and the remaining are normalized to attain the same energies of speech and noise signals involved) but also the minimum squared error between the actual and its estimate averaged over all frequencies .…”

Section: MMSE Solution Using Over-determined System Of (mentioning)

confidence: 99%
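The fitting step described in the quotation above can be illustrated with a small sketch. It is a simplification and not the citing authors' method: it assumes one mean power spectrum for speech and one per noise candidate, two unknown weights, a plain least-squares fit through the normal equations, and a crude clamp of negative weights to zero. The names `fit_two_weights` and `pick_noise_candidate` are hypothetical.

```python
def fit_two_weights(observed, speech_mean, noise_mean):
    """Least-squares fit of observed[k] ~ a*speech_mean[k] + b*noise_mean[k].

    There is one equation per frequency bin and only two unknowns (a, b),
    so the system is over-determined. Returns (a, b, mean squared error).
    """
    # Entries of the 2x2 normal equations.
    ss = sum(s * s for s in speech_mean)
    nn = sum(n * n for n in noise_mean)
    sn = sum(s * n for s, n in zip(speech_mean, noise_mean))
    sy = sum(s * y for s, y in zip(speech_mean, observed))
    ny = sum(n * y for n, y in zip(noise_mean, observed))
    det = ss * nn - sn * sn
    if abs(det) < 1e-12:
        raise ValueError("speech and noise templates are collinear")
    a = (sy * nn - ny * sn) / det
    b = (ny * ss - sy * sn) / det
    # Assumption: negative weights are simply clamped to zero.
    a, b = max(a, 0.0), max(b, 0.0)
    residuals = [y - a * s - b * n
                 for y, s, n in zip(observed, speech_mean, noise_mean)]
    mse = sum(r * r for r in residuals) / len(residuals)
    return a, b, mse


def pick_noise_candidate(observed, speech_mean, candidates):
    """Return the name of the noise candidate with the smallest fit error."""
    return min(
        candidates,
        key=lambda name: fit_two_weights(observed, speech_mean, candidates[name])[2],
    )
```

Choosing the candidate with the smallest residual mirrors the idea that the fit error itself indicates which noise model best explains the frame.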

“…However, in many practical applications the noise is time-varying and hence leads to sub-optimum results. Several techniques, found in the literature, address this problem [5]- [6]. Most of them concentrate on avoiding explicit speech/non-speech classification and resort to measures of recursively estimating the noise Power Spectral Density (PSD).…”

Section: Introduction (mentioning)

confidence: 99%
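Recursive noise PSD estimation, as mentioned in the quotation above, is commonly built on first-order smoothing of the noisy power spectrum. The sketch below shows only that generic recursion; it is not the adaptive algorithm of Lin et al., whose update rule is not given on this page. The function name `update_noise_psd` and the smoothing factor `alpha` are assumptions.

```python
def update_noise_psd(noise_psd, frame_power, alpha=0.9):
    """One recursive update: N[k] <- alpha * N[k] + (1 - alpha) * |Y[k]|^2.

    noise_psd   -- current noise PSD estimate, one value per frequency bin
    frame_power -- power spectrum of the newest noisy frame
    alpha       -- smoothing factor in [0, 1]; larger values adapt more slowly
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(noise_psd) != len(frame_power):
        raise ValueError("spectra must have the same number of bins")
    return [alpha * n + (1.0 - alpha) * p
            for n, p in zip(noise_psd, frame_power)]
```

No speech/non-speech decision appears in the update, which is the property the quotation highlights. A practical estimator would additionally vary `alpha` over time and frequency so that speech energy does not leak into the noise estimate.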


MMSE speech enhancement based on GMM and solving an over-determined system of equations

Chehresa

Savoji

2011

2011 IEEE 7th International Symposium on Intelligent Signal Processing


A new and effective algorithm is proposed in this paper based on Gaussian Mixture Modelling (GMM) and the Minimum Mean Square Error (MMSE) criterion for speech enhancement, where no assumption is made on the nature or stationarity of the noise. No Voice Activity Detection (VAD) or any other means is used to estimate the input Signal to Noise Ratio (SNR). The mean vectors of the mixture models of spectral magnitudes, derived from models of the power spectra of speech and different noise sources, are used to form sets of over-determined systems of equations, as many as there are noise source candidates, whose solutions lead to the MMSE estimates of the speech and additive noise spectral magnitudes. The corresponding power spectra are then used for noise suppression by applying Wiener filtering carried out on overlapping frames. The input SNR is estimated and the nature of the noise involved is determined as by-products of the method used. Results are compared with codebook constrained methods that have shown very good results but suffer from long processing times. It is shown that, at the cost of a slightly lower improvement in SNR and PESQ score, the new algorithm reduces the computation time to one fifth, which makes it suitable for practical applications. (Abstract)
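The Wiener filtering and SNR by-product mentioned in the abstract can be sketched as follows. This uses the textbook per-bin Wiener gain and a full-band SNR, taking the speech and noise power spectra as already estimated. The function names and the `floor` guard are assumptions, and the framing and overlap-add steps are omitted.

```python
import math


def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Per-bin Wiener gain G[k] = S[k] / (S[k] + N[k])."""
    # The floor only guards against division by zero in empty bins.
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]


def estimated_snr_db(speech_psd, noise_psd):
    """Full-band SNR in dB computed from the two estimated power spectra."""
    return 10.0 * math.log10(sum(speech_psd) / sum(noise_psd))


def enhance_frame(noisy_magnitude, speech_psd, noise_psd):
    """Scale each noisy spectral magnitude by the Wiener gain of its bin."""
    gains = wiener_gain(speech_psd, noise_psd)
    return [g * m for g, m in zip(gains, noisy_magnitude)]
```

Bins dominated by speech keep a gain near 1, while bins dominated by noise are attenuated towards 0.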


“…The value of VAS parameter obtained from VAD approach is used to calculate the subband noise power. The noise estimation for each subband is computed using the adaptive noise estimation algorithm proposed by Lin et al [24].…”

Section: Proposed Speech Enhancement Algorithm (mentioning)

confidence: 99%

An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment

Wang

2010

IEICE Trans. Inf. & Syst.


SUMMARY: Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulty of accurately estimating the local noise spectrum. In this paper, a simple method of noise estimation employing a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that a VAD using the bark-scale spectral entropy parameter, called BS-Entropy, is superior to other energy-based approaches, especially at variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower band must be preserved. In contrast, the WCT tends to increase in the lower band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.
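The frame-dependent thresholding described in the summary can be illustrated for a single lower subband. This is a sketch under stated assumptions: soft thresholding is assumed (the summary does not say whether the rule is soft or hard), the scale factors in `THRESHOLD_SCALE` are made-up illustrative values, and the frame class is taken as given rather than derived from BS-Entropy.

```python
import math

# Hypothetical scale factors per frame class; the summary gives no numbers.
# Voiced frames get a low threshold so lower-band coefficients survive,
# unvoiced frames a higher one, noise-dominated frames the highest.
THRESHOLD_SCALE = {"voiced": 0.5, "unvoiced": 1.5, "noise": 3.0}


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient towards zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise_subband(coeffs, base_threshold, frame_class):
    """Apply a threshold scaled according to the VAD frame class."""
    scale = THRESHOLD_SCALE[frame_class]
    return soft_threshold(coeffs, base_threshold * scale)
```

For example, `denoise_subband(wcs, 1.0, "noise")` shrinks the coefficients far more aggressively than the same call with `"voiced"`.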


“…However, in many practical applications the noise is time-varying and hence leads to sub-optimum results. Several techniques, found in literature, address this problem [5]- [6]. Most of them concentrate on avoiding explicit speech/non-speech classification and resort to measures of recursively estimating the noise PSD.…”

Section: Introduction (mentioning)

confidence: 99%

Codebook constrained iterative and Parametric Wiener filter speech enhancement

Chehresa

Savoji

2009

2009 IEEE International Conference on Signal and Image Processing Applications


In this paper, a new iterative method of speech enhancement using Power Spectral Density (PSD) codebooks of clean speech and several types of noise is proposed. The proposed algorithm estimates the PSDs of speech and noise of unknown nature and evaluates the input Signal-to-Noise Ratio (SNR) by solving an over-determined set of equations. No Voice Activity Detection (VAD) or other means of noise spectral estimation, such as minimum statistics, is used. The pre-calculated codebooks are tree-structured for the sake of processing speed. The Wiener filter is used in the first instance because of its simplicity. A new variant of the Parametric Wiener filter, whose parameters are controlled by the skewness and kurtosis of the estimated clean speech and noise, is also used to further suppress the noise. The results of employing these iterative algorithms are reported and compared for the enhancement of noisy speech with different noise types and different input SNRs.

Keywords: iterative and parametric Wiener filters, PSD codebook, tree-structured codebook, noise estimation, skewness and kurtosis

I. INTRODUCTION

In real environments, the presence of interfering noise always greatly degrades the performance of speech communication systems. Some techniques have been developed to solve the problem over the past decades including, for instance, spectral subtraction, Wiener filtering and all-pole modelling non-causal Wiener filtering [1]. Most of these techniques assume that the interfering signal is stationary, additive and not speech-like. Since the required noise statistics can only be estimated during speech pauses, a VAD is needed in single-channel approaches, where only the noisy observation is available. Alternatively, noise estimation based on minimum statistics can be used. However, poor performance is achieved when the interference is time-varying and also speech-like.

Iterative speech enhancement algorithms perform better at the cost of an increase in complexity. In [2], Lim and Oppenheim proposed the iterative Wiener filtering (IWF) technique for speech enhancement, where the estimation of the all-pole parameters of speech in additive white Gaussian noise was posed as a two-step sequential Maximum A-Posteriori (MAP) estimation problem. In [3], Hansen and Clements showed that constraints in the parameter estimation are essential in order to retain speech-like characteristics of the enhanced speech. In [4], a clustering-based approach, namely the codebook constrained iterative Wiener filtering scheme, was proposed as an alternative method of imposing constraints. Here, the all-pole parameters are constrained to belong to a codebook of clean speech vectors. Apart from successfully defining a convergence criterion, this approach was quite effective in taking care of several types of speech constraints, such as those between the formants and those due to speaker variability.

In all the above approaches only stationary noise is considered. However, in many practical applications the noise is time-varying and ...
