The main aim of this paper is to introduce a new approach to enhance speech signals by exploring the advantages of nonlocal means (NLM) estimation and empirical mode decomposition. NLM, a patch-based denoising method, is extensively used for two-dimensional signals like images. However, its use for one-dimensional signals has been attracting more attention recently. The NLM-based approach is quite useful for removing low-frequency noises based on nonlocal similarities present among samples of the signal. However, there is an issue of under averaging in the high-frequency regions. The temporal and spectral characteristics of the speech signal are changing markedly over time. Thus NLM is conventionally not effective to remove the noise components from the speech signal, unlike image denoising. To address this issue, initially, the speech signal is decomposed into oscillatory components called intrinsic mode functions (IMFs) by using a temporal decomposition technique known as the sifting process. Each IMF represents signal information at a certain scale or frequency band. The IMFs do not have abrupt power spectral changes over time. The decomposed IMFs are processed using NLM estimation based on nonlocal similarities for better speech enhancement. The simulation result shows that the proposed method gives better performance in terms of subjective and objective quality measures. Its performance is evaluated for white, factory, and babble noises at different signal to noise ratios.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.