This paper presents a hybrid approach to single-channel speech enhancement that combines a deep neural network (DNN) with harmonic regeneration noise reduction (HRNR). The DNN serves as a supervised algorithm that predicts a new target mask, the constrained Wiener filter (cWF) mask, from a noisy mixture signal transformed into gammatone filter bank features. HRNR is then applied as a postfilter to eliminate residual noise. DNN-based methods are an emerging class of supervised speech enhancement that addresses heavily nonstationary noise and low signal-to-noise ratio (SNR) conditions. To validate the proposed algorithm with the new target mask, 600 Malay utterances from male and female speakers were used for training and 120 Malay utterances were used for prediction. Short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) scores were calculated as the performance metrics. The proposed target mask outperformed the baseline target masks. For the hybrid speech enhancement algorithm, the PESQ and STOI scores are 1.17 and 0.79, respectively, for babble noise at an SNR of -5 dB.
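To make the idea of a Wiener-type target mask concrete, the following is a minimal sketch and not the paper's cWF definition, which the abstract does not give. It assumes the common ratio form S / (S + N) per time-frequency unit, with a clipping range standing in for the "constraint"; the function names, the `floor` and `ceil` parameters, and the toy energies are all hypothetical.

```python
import numpy as np


def wiener_style_mask(speech_power, noise_power, floor=0.0, ceil=1.0, eps=1e-12):
    """Return a per-unit gain S / (S + N), clipped to [floor, ceil].

    The clipping range is an assumed stand-in for the paper's constraint.
    """
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # eps guards against division by zero in silent units.
    mask = speech_power / (speech_power + noise_power + eps)
    return np.clip(mask, floor, ceil)


def apply_mask(mixture_power, mask):
    """Scale each time-frequency unit of the mixture energy by the mask."""
    return np.asarray(mixture_power, dtype=float) * mask


# Toy energies: 2 filter bank channels x 3 frames (made-up values).
speech = np.array([[4.0, 1.0, 0.0], [9.0, 0.0, 1.0]])
noise = np.array([[1.0, 1.0, 2.0], [1.0, 3.0, 1.0]])

mask = wiener_style_mask(speech, noise)
# Assumes speech and noise energies add in the mixture (uncorrelated sources).
enhanced = apply_mask(speech + noise, mask)
```

In training, a mask of this kind is computed from the clean and noise signals and used as the regression target. At prediction time the DNN estimates it from the noisy features alone, so the recovered energies are only as good as the predicted mask.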