This paper presents a novel data adaptive thresholding approach to single channel speech enhancement. The noisy speech signal and fractional Gaussian noise (fGn) are combined to produce the complex signal. The fGn is generated using the noise variance roughly estimated from the noisy speech signal. Bivariate empirical mode decomposition (bEMD) is employed to decompose the complex signal into a finite number of complex-valued intrinsic mode functions (IMFs). The real and imaginary parts of the IMFs represent the IMFs of observed speech and fGn, respectively. Each IMF is divided into short time frames for local processing. The variance of IMF of fGn calculated within a frame is used as the reference term to classify corresponding noisy speech frame into noise and signal dominant frames. Only the noise dominant frames are soft-thresholded to reduce the noise effects. Then, all the frames as well as IMFs of speech are combined, yielding the enhanced speech signal. The experimental results show the improved performance of the proposed algorithm compared to the recently reported methods.
Empirical mode decomposition (EMD) is a newly developed tool to analyze nonlinear and non-stationary signals.It is used to decompose any signal into a finite number of time varying subband signals termed as intrinsic mode functions (IMFs). Such data adaptive decomposition is recently used in speech enhancement. This study presents the concept of EMD and its application to advanced speech signal processing paradigms including speech enhancement by soft-thresholding, voiced/unvoiced (V/Uv) speech discrimination and pitch estimation. The speech processing is frequently performed in the transformed domain and the transformation is usually achieved by traditional signal analysis techniques i.e. Fourier and wavelet transformations. These analysis methods employ priori basis function and it is not suitable for data adaptive analysis for non-stationary signal like speech. Recently, EMD is taken much attention for speech signal processing in data adaptive way. Several EMD based potential soft-thresholding algorithms for speech enhancement are discussed here. The V/Uv discrimination is an important concern in speech processing. It is usually performed by using acoustic features. The training data is used to determine the threshold for classification. The EMD based data adaptive thresholding approach is developed for V/Uv discrimination without any training phase. Noticeable improvement is achieved with the application of EMD in pitch estimation of noisy speech signals. The related experimental results are also presented to realize the effectiveness of EMD in advanced speech processing algorithms.
Abstract:A method of and a system for speech enhancement consists of Hilbert spectrum and wavelet packet analysis is studied. We implement ISA to separate speech and interfering signals from single mixture and wavelet packet based softthresholding algorithm to enhance the quality of target speech. The mixed signal is projected onto time-frequency (TF) space using empirical mode decomposition (EMD) based Hilbert spectrum (HS). Then a finite set of independent basis vectors are derived from the TF space by applying principal component analysis (PCA) and independent component analysis (ICA) sequentially. The vectors are clustered using hierarchical clustering to represent the independent subspaces corresponding to the component sources in the mixture. However, the speech quality of the separation algorithm is not enough and contains some residual noises. Therefore, in the next stage, the target speech is enhanced using wavelet packet decomposition (WPD) method where the speech activity is monitored by updating noise or unwanted signals statistics. The mode mixing issue of traditional EMD is addressed and resolved using ensemble EMD. The proposed algorithm is also tested using short-time Fourier transform (STFT) based spectrogram method. The simulation results show a noticeable performance in the field of audio source separation and speech enhancement.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.