This paper proposes a new iterative speech enhancement method that uses Power Spectral Density (PSD) codebooks of clean speech and of several types of noise. The proposed algorithm estimates the PSDs of the speech and of noise of unknown nature, and evaluates the input Signal-to-Noise Ratio (SNR), by solving an over-determined set of equations. Neither Voice Activity Detection (VAD) nor any other means of noise spectral estimation, such as minimum statistics, is used. The pre-calculated codebooks are tree-structured to speed up processing. The Wiener filter is used in the first instance because of its simplicity. A new variant of the parametric Wiener filter, whose parameters are controlled by the skewness and kurtosis of the estimated clean speech and noise, is also used to suppress the noise further. Results obtained with these iterative algorithms are reported and compared for the enhancement of noisy speech across different noise types and input SNRs.

Keywords: iterative and parametric Wiener filters, PSD codebook, tree-structured codebook, noise estimation, skewness and kurtosis
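As an illustration of the idea summarised above, the following minimal sketch fits a noisy PSD as a weighted sum of one speech codebook entry and one noise codebook entry, then forms a Wiener gain and an SNR estimate from the fitted PSDs. It is a sketch under stated assumptions, not the algorithm of this paper: the two-gain model, the least-squares solution, the clipping of negative gains and all names (`fit_gains`, `wiener_gain`, `snr_db`, the toy PSD values) are hypothetical and chosen only for illustration.

```python
import math

# Illustrative sketch only. The two-gain model noisy[k] ~ a*S[k] + b*N[k],
# the function names and the toy PSD values are assumptions, not the paper's.

def fit_gains(noisy_psd, speech_shape, noise_shape):
    """Least-squares gains (a, b) from K equations in 2 unknowns (K > 2)."""
    # Entries of the 2x2 normal equations built from the K frequency bins.
    ss = sum(s * s for s in speech_shape)
    nn = sum(n * n for n in noise_shape)
    sn = sum(s * n for s, n in zip(speech_shape, noise_shape))
    sy = sum(s * y for s, y in zip(speech_shape, noisy_psd))
    ny = sum(n * y for n, y in zip(noise_shape, noisy_psd))
    det = ss * nn - sn * sn
    if abs(det) < 1e-12:
        raise ValueError("speech and noise shapes are linearly dependent")
    # Cramer's rule for the 2x2 system.
    a = (sy * nn - ny * sn) / det
    b = (ny * ss - sy * sn) / det
    # A PSD cannot be negative, so clip the gains at zero (crude safeguard).
    return max(a, 0.0), max(b, 0.0)

def wiener_gain(speech_psd, noise_psd, floor=1e-12):
    """Per-bin Wiener gain H[k] = Ps[k] / (Ps[k] + Pn[k])."""
    return [s / max(s + n, floor) for s, n in zip(speech_psd, noise_psd)]

def snr_db(speech_psd, noise_psd):
    """Input SNR in dB from the total estimated speech and noise power."""
    return 10.0 * math.log10(sum(speech_psd) / sum(noise_psd))

# Toy data: 4 frequency bins, hypothetical codebook shapes, true gains 3 and 2.
speech_shape = [4.0, 2.0, 1.0, 0.5]
noise_shape = [1.0, 1.0, 1.0, 1.0]
noisy_psd = [3.0 * s + 2.0 * n for s, n in zip(speech_shape, noise_shape)]

a, b = fit_gains(noisy_psd, speech_shape, noise_shape)
speech_psd = [a * s for s in speech_shape]
noise_psd = [b * n for n in noise_shape]
gains = wiener_gain(speech_psd, noise_psd)
snr = snr_db(speech_psd, noise_psd)
```

In a codebook search, a fit of this kind would presumably be repeated for each candidate pair of speech and noise entries, keeping the pair with the smallest residual. The tree structure mentioned above would serve to reduce the number of candidates examined.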
I. INTRODUCTION

In real environments, interfering noise degrades the performance of speech communication systems. Several techniques have been developed over the past decades to address this problem, including spectral subtraction, Wiener filtering and non-causal Wiener filtering with all-pole modelling [1]. Most of these techniques assume that the interfering signal is stationary, additive and not speech-like. Since the required noise statistics can only be estimated during speech pauses, a VAD is needed in single-channel approaches, where only the noisy observation is available. Alternatively, noise estimation based on minimum statistics can be used. However, performance is poor when the interference is time-varying and speech-like.

Iterative speech enhancement algorithms perform better at the cost of increased complexity. In [2], Lim and Oppenheim proposed the iterative Wiener filtering (IWF) technique for speech enhancement, in which the estimation of the all-pole parameters of speech in additive white Gaussian noise was posed as a two-step sequential Maximum A-Posteriori
(MAP) estimation problem. In [3], Hansen and Clements showed that constraints on the parameter estimation are essential to retain the speech-like characteristics of the enhanced speech. In [4], a clustering-based approach, namely the codebook-constrained iterative Wiener filtering scheme, was proposed as an alternative method of imposing constraints. Here, the all-pole parameters are constrained to belong to a codebook of clean speech vectors. Besides providing a well-defined convergence criterion, this approach handled several types of speech constraints effectively, such as those between the formants and those due to speaker variability.

In all the above approaches only stationary noise is considered. However, in many practical applications the noise is time-varying and ...