The combination of delayed sound from a digital hearing aid with direct sound through an open or vented fitting can degrade sound quality through audible changes in timbre and/or the perception of echo. The present study was designed to test a number of delay and high-pass filter combinations under worst-case (i.e., most sensitive) conditions. Eighteen normal-hearing and 18 mildly hearing-impaired subjects performed a paired-comparison (A/B) task in which they selected the setting they preferred with respect to sound quality. The test took place in an anechoic chamber using recorded speech, environmental sounds, and the subjects' own voice. Experimental hearing aids were fitted binaurally with open domes, thus providing maximum ventilation. The preference data were processed using a statistical choice model that derives a ratio scale. The analysis indicated that, under these test conditions, sound quality did not change when the delay was varied in the range 5-10 ms, and that 2000 Hz high-pass filtering was preferred in most conditions, regardless of the hearing losses tested.
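As an illustration of how paired-comparison preferences can be turned into a ratio scale, the sketch below fits a Bradley-Terry-Luce (BTL) model with a simple iterative update. The abstract does not name the choice model that was used, so BTL is an assumption here, and the function name `fit_btl` and the preference counts are hypothetical, not the study's data.

```python
def fit_btl(wins, n_iter=200):
    """Estimate BTL strengths from a matrix of preference counts.

    wins[i][j] is the number of times option i was preferred over option j.
    Returns strengths that sum to 1; ratios between them form the ratio scale.
    Assumes every option was preferred at least once.
    """
    n = len(wins)
    p = [1.0 / n] * n  # start from equal strengths
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes its number of comparisons divided by
            # the current combined strength of the two options.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom > 0 else p[i])
        scale = sum(updated)
        p = [v / scale for v in updated]
    return p


# Hypothetical counts for three settings, 18 comparisons per pair.
example_wins = [
    [0, 12, 15],
    [6, 0, 11],
    [3, 7, 0],
]
strengths = fit_btl(example_wins)
```

A strength ratio such as `strengths[0] / strengths[1]` can then be read as how many times more preferred one setting is than another, which is what distinguishes a ratio scale from a simple rank order.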
Hearing aid users are challenged in noisy listening situations, especially speech-on-speech situations with two or more competing voices. Specifically, attending to and segregating two competing voices is particularly hard for them, unlike for normal-hearing listeners, as shown in a small sub-experiment. In the main experiment, the competing-voices benefit of a deep neural network (DNN) based stream segregation enhancement algorithm was tested on hearing-impaired listeners. A mixture of two voices was separated using a DNN, presented to the two ears as individual streams, and tested for word score. Compared to the unseparated mixture, the separation gave a benefit of 13 percentage points when listeners attended to both voices. If only one output was selected, as in a traditional target-masker scenario, a larger benefit of 37 percentage points was found. The results agreed well with objective metrics and show that, for hearing-impaired listeners, DNNs have a large potential for improving stream segregation and speech intelligibility in difficult scenarios with two equally important targets, without any prior selection of a primary target stream. An even higher benefit can be obtained if the user can select the preferred target via remote control.
Mean square error (MSE) has been the preferred loss function in current deep neural network (DNN) based speech separation techniques. In this paper, we propose a new cost function that aims to optimize the extended short-time objective intelligibility (ESTOI) measure. We focus on applications where low algorithmic latency (≤ 10 ms) is important. We use long short-term memory (LSTM) networks and evaluate the proposed approach on four sets of two-speaker mixtures from the extended Danish hearing in noise (HINT) dataset. We show that the proposed loss function can offer improved or on-par objective intelligibility (in terms of ESTOI) compared to an MSE-optimized baseline, while resulting in lower objective separation performance (in terms of the source-to-distortion ratio (SDR)). We then propose an approach in which the network is first initialized with weights optimized for the MSE criterion and then trained with the proposed ESTOI loss criterion. This approach mitigates some of the losses in objective separation performance while preserving the gains in objective intelligibility.
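To make the contrast between the two training criteria concrete, the sketch below compares MSE with a negative normalized-correlation loss on a single envelope segment. This is a deliberately simplified stand-in and not the paper's cost function: ESTOI operates on one-third octave band envelopes with further normalization steps, and only its core idea of correlating normalized envelopes is kept here. The function names and sample values are hypothetical.

```python
import math


def mse_loss(est, ref):
    """Mean square error between an estimated and a reference segment."""
    return sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref)


def normalize(x):
    """Remove the mean and scale to unit Euclidean norm."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered


def correlation_loss(est, ref):
    """One minus the correlation of the normalized segments.

    Ranges from 0 (perfectly correlated) to 2 (perfectly anti-correlated).
    """
    e, r = normalize(est), normalize(ref)
    return 1.0 - sum(a * b for a, b in zip(e, r))


# Hypothetical envelope segment and an estimate that is correct in shape
# but wrong in level.
reference = [0.1, 0.4, 0.9, 0.3]
scaled_estimate = [2.0 * v for v in reference]

mse_value = mse_loss(scaled_estimate, reference)
corr_value = correlation_loss(scaled_estimate, reference)
```

Here the scaled estimate is penalized by MSE but not by the correlation loss, which shows in miniature how the two criteria can rank the same estimate differently. The two-stage scheme described above would correspond to minimizing `mse_loss` first and then continuing training with the intelligibility-oriented loss.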
Old, hearing-impaired listeners generally benefit little from lateral separation of multiple talkers when listening to one of them. This study aimed to determine how spatial release from masking (SRM) in such listeners is affected when the interaural time differences (ITDs) in the temporal fine structure (TFS) are manipulated by tone-vocoding (TVC) at the ears by a master hearing aid system. Word recall was compared, with and without TVC, when target and masker sentences from a closed set were played simultaneously from the front loudspeaker (co-located) and when the maskers were played 45° to the left and right of the listener (separated). For 20 hearing-impaired listeners aged 64 to 86, SRM was 3.7 dB smaller with TVC than without TVC. This difference in SRM correlated with mean audiometric thresholds below 1.5 kHz, even when monaural TFS sensitivity (discrimination of frequency-shifts in identically filtered complexes) was partialed out, suggesting that low-frequency audiometric thresholds may be a good indicator of candidacy for hearing aids that preserve ITDs. The TVC difference in SRM was not correlated with age, pure-tone ITD thresholds, or fundamental frequency difference limens, and was correlated with monaural TFS sensitivity only before controlling for low-frequency audiometric thresholds.
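The SRM comparison can be sketched as simple arithmetic, assuming the conventional definition of SRM as the co-located threshold minus the separated threshold in dB. The abstract does not state how the thresholds were obtained, so that definition, the function name, and all numbers below are hypothetical and are not the study's data.

```python
def spatial_release(threshold_colocated_db, threshold_separated_db):
    """SRM in dB: how much the threshold improves when maskers are separated.

    Lower thresholds are better, so a positive result means the listener
    benefited from the spatial separation.
    """
    return threshold_colocated_db - threshold_separated_db


# Hypothetical thresholds (dB) for one listener.
srm_without_tvc = spatial_release(-2.0, -8.0)  # ITDs in the TFS intact
srm_with_tvc = spatial_release(-1.5, -4.0)     # TFS ITDs removed by tone-vocoding

# Reduction in SRM attributable to the tone-vocoding manipulation.
srm_reduction = srm_without_tvc - srm_with_tvc
```

A per-listener value like `srm_reduction` is the kind of quantity that would then be correlated with low-frequency audiometric thresholds.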