In the real world, speech is usually distorted by both reverberation and background noise. In such conditions, speech intelligibility is degraded substantially, especially for hearing-impaired (HI) listeners. It is therefore essential to enhance speech in noisy and reverberant environments. Recently, deep neural networks have been introduced to learn a spectral mapping to enhance corrupted speech, and have shown significant improvements in objective metrics and automatic speech recognition scores. However, listening tests have not yet shown any speech intelligibility benefit. In this paper, we propose to enhance noisy and reverberant speech by learning a mapping to reverberant target speech rather than anechoic target speech. A preliminary listening test was conducted, and the results show that the proposed algorithm is able to improve speech intelligibility for HI listeners in some conditions. Moreover, we develop a masking-based method for denoising and compare it with the spectral mapping method. Evaluation results show that the masking-based method outperforms the mapping-based method.
Index Terms: speech intelligibility test, speech denoising, spectral mapping, ideal ratio mask, deep neural networks
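To make the masking-based idea concrete, the following is a minimal sketch of an ideal ratio mask (IRM) computed per time-frequency unit. It assumes a commonly used definition, (S / (S + N)) raised to an exponent of 0.5, where S and N are the target and noise powers; the abstract does not state the exact definition, so the exponent, the function names, and the choice of NumPy are all assumptions made for illustration. In keeping with the abstract, the target power would come from the reverberant (not anechoic) speech.

```python
import numpy as np

def ideal_ratio_mask(target_power, noise_power, beta=0.5, eps=1e-12):
    """Assumed IRM definition: (S / (S + N)) ** beta per time-frequency unit.

    target_power, noise_power: non-negative arrays of the same shape
    (e.g. frames x frequency bins). eps avoids division by zero.
    """
    target_power = np.asarray(target_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return (target_power / (target_power + noise_power + eps)) ** beta

def apply_mask(mixture_magnitude, mask):
    """Scale the mixture magnitude spectrogram by the mask, unit by unit."""
    return np.asarray(mixture_magnitude, dtype=float) * mask
```

In a masking-based system a network is trained to estimate such a mask from features of the noisy input, whereas a mapping-based system is trained to predict the target spectrum directly.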
Binaural beamforming algorithms for head-mounted assistive listening devices are crucial to improve speech quality and speech intelligibility in noisy environments, while maintaining the spatial impression of the acoustic scene. While the well-known BMVDR beamformer is able to preserve the binaural cues of one desired source, the BLCMV beamformer uses additional constraints to also preserve the binaural cues of interfering sources. In this paper, we provide theoretical and practical insights on how to optimally set the interference scaling parameters in the BLCMV beamformer for an arbitrary number of interfering sources. In addition, since in practice only a limited temporal observation interval is available to estimate all required beamformer quantities, we provide an experimental evaluation in a complex acoustic scenario using measured impulse responses from hearing aids in a cafeteria for different observation intervals. The results show that even rather short observation intervals are sufficient to achieve a decent noise reduction performance and that a proposed threshold on the optimal interference scaling parameters leads to smaller binaural cue errors in practice.
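The role of the interference scaling parameters can be illustrated with a minimal sketch of the standard linearly constrained minimum variance (LCMV) closed form, w = R^{-1} C (C^H R^{-1} C)^{-1} g. The constraint set below (desired source passed unchanged to the reference microphone, each interferer passed with a real scaling factor) is an assumption based on the usual textbook formulation, not the paper's exact notation; the names `blcmv_weights`, `delta`, and `ref` are hypothetical.

```python
import numpy as np

def lcmv_weights(R, C, g):
    """LCMV closed form: minimize w^H R w subject to C^H w = g."""
    Rinv_C = np.linalg.solve(R, C)  # R^{-1} C without forming the inverse
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

def blcmv_weights(R, a, B, delta, ref):
    """Filter for one ear (one reference microphone), under assumed constraints.

    R:     (M, M) Hermitian positive definite covariance matrix
    a:     (M,)   transfer function vector of the desired source
    B:     (M, K) transfer function vectors of the K interferers
    delta: (K,)   real interference scaling parameters
    ref:   index of the reference microphone for this ear
    Constraints: w^H a = a[ref] and w^H B[:, k] = delta[k] * B[ref, k].
    """
    C = np.column_stack([a, B])
    g = np.concatenate([[np.conj(a[ref])], delta * np.conj(B[ref, :])])
    return lcmv_weights(R, C, g)
```

Calling `blcmv_weights` once with the left reference microphone and once with the right yields the two output filters. Because both filters scale each interferer by the same factor relative to its reference-microphone component, the interaural relationship of that interferer is preserved at the output, which is the cue-preservation property the abstract describes. In practice R and the transfer functions must be estimated from a limited observation interval.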