This study presents revision, extension, and evaluation of a binaural speech intelligibility model (Beutelmann, R., and Brand, T. (2006). J. Acoust. Soc. Am. 120, 331-342) that yields accurate predictions of speech reception thresholds (SRTs) in the presence of a stationary noise source at arbitrary azimuths and in different rooms. The modified model is based on an analytical expression of binaural unmasking for arbitrary input signals and is computationally more efficient, while maintaining the prediction quality of the original model. An extension for nonstationary interferers was realized by applying the model to short time frames of the input signals and averaging over the predicted SRT results. Binaural SRTs from 8 normal-hearing and 12 hearing-impaired subjects, incorporating all combinations of four rooms, three source setups, and three noise types were measured and compared to the model's predictions. Depending on the noise type, the parametric correlation coefficients between observed and predicted SRTs were 0.80-0.93 for normal-hearing subjects and 0.59-0.80 for hearing-impaired subjects. The mean absolute prediction error was 3 dB for the mean normal-hearing data and 4 dB for the individual hearing-impaired data. 70% of the variance of the SRTs of hearing-impaired subjects could be explained by the model, which is based only on the audiogram.
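The nonstationary-interferer extension described in this abstract (apply the model to short time frames, then average the predicted SRTs) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the non-overlapping frames, and the choice to frame both speech and noise are assumptions. `toy_predictor` is a hypothetical stand-in that returns only a noise-to-speech level difference, not a binaural SRT prediction.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a (samples, channels) array into frames of frame_len samples."""
    if x.shape[0] < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (x.shape[0] - frame_len) // hop
    return [x[i * hop : i * hop + frame_len] for i in range(n_frames)]

def short_time_srt(speech, noise, frame_len, hop, predict_srt):
    """Call predict_srt on each pair of frames and average the per-frame SRTs."""
    speech_frames = frame_signal(speech, frame_len, hop)
    noise_frames = frame_signal(noise, frame_len, hop)
    srts = [predict_srt(s, n) for s, n in zip(speech_frames, noise_frames)]
    return float(np.mean(srts))

def toy_predictor(speech_frame, noise_frame):
    """Hypothetical placeholder: noise level minus speech level in dB.

    This is NOT the binaural model from the abstract; it only makes the
    framing-and-averaging step runnable.
    """
    eps = 1e-12  # avoids log of zero for silent frames
    noise_power = np.mean(noise_frame ** 2) + eps
    speech_power = np.mean(speech_frame ** 2) + eps
    return 10.0 * np.log10(noise_power / speech_power)
```

Any stationary-noise predictor with the same two-argument signature could be passed in place of `toy_predictor`. The frame length and hop are left as parameters because the abstract does not state them.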
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.
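The equalization-cancellation (EC) step named in this abstract can be sketched for a single frequency band as follows. One ear signal is delayed and scaled, subtracted from the other, and the parameters giving the best signal-to-noise ratio are kept. The grid search, the circular shift via `np.roll`, and all names here are assumptions for illustration. The gammatone filter bank, the artificial EC processing errors, the audiogram-based masking noise, and the SII stage are omitted.

```python
import numpy as np

def ec_best_snr_db(speech_lr, noise_lr, max_delay, gains):
    """Grid-search EC parameters for one band.

    speech_lr, noise_lr: arrays of shape (samples, 2) holding left/right ears.
    Returns (best_snr_db, delay_in_samples, gain).
    """
    def power(x):
        return np.mean(x ** 2) + 1e-12  # small floor avoids division by zero

    s_l, s_r = speech_lr[:, 0], speech_lr[:, 1]
    n_l, n_r = noise_lr[:, 0], noise_lr[:, 1]

    best = (-np.inf, 0, 1.0)
    for delay in range(-max_delay, max_delay + 1):
        # Equalization: shift the right-ear signals (circular shift, a
        # simplification that is acceptable only for a sketch).
        s_r_shifted = np.roll(s_r, delay)
        n_r_shifted = np.roll(n_r, delay)
        for gain in gains:
            # Cancellation: subtract the equalized right ear from the left ear.
            speech_residual = s_l - gain * s_r_shifted
            noise_residual = n_l - gain * n_r_shifted
            snr_db = 10.0 * np.log10(power(speech_residual) / power(noise_residual))
            if snr_db > best[0]:
                best = (snr_db, delay, gain)
    return best
```

In a fuller model the result would typically be compared with the better-ear SNR in the same band, and the per-band SNRs would feed an SII calculation. Both steps are left out here.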
Auditory models have been developed for decades to simulate characteristics of the human auditory system, but it is often unknown how well auditory models compare to each other or perform in tasks they were not primarily designed for. This study systematically analyzes predictions of seven publicly-available cochlear filter models in response to a fixed set of stimuli to assess their capabilities of reproducing key aspects of human cochlear mechanics. The following features were assessed at frequencies of 0.5, 1, 2, 4, and 8 kHz: cochlear excitation patterns, nonlinear response growth, frequency selectivity, group delays, signal-in-noise processing, and amplitude modulation representation. For each task, the simulations were compared to available physiological data recorded in guinea pigs and gerbils as well as to human psychoacoustics data. The presented results provide application-oriented users with comprehensive information on the advantages, limitations and computation costs of these seven mainstream cochlear filter models.
The amount of masking of sounds from one source (signals) by sounds from a competing source (maskers) heavily depends on the sound characteristics of the masker and the signal and on their relative spatial location. Numerous studies investigated the ability to detect a signal in a speech or a noise masker or the effect of spatial separation of signal and masker on the amount of masking, but there is a lack of studies investigating the combined effects of many cues on the masking as is typical for natural listening situations. The current study using free-field listening systematically evaluates the combined effects of harmonicity and inharmonicity cues in multi-tone maskers and cues resulting from spatial separation of target signal and masker on the detection of a pure tone in a multi-tone or a noise masker. A linear binaural processing model was implemented to predict the masked thresholds in order to estimate whether the observed thresholds can be accounted for by energetic masking in the auditory periphery or whether other effects are involved. Thresholds were determined for combinations of two target frequencies (1 and 8 kHz), two spatial configurations (masker and target either co-located or spatially separated by 90 degrees azimuth), and five different masker types (four complex multi-tone stimuli, one noise masker). A spatial separation of target and masker resulted in a release from masking for all masker types. The amount of masking significantly depended on the masker type and frequency range. The various harmonic and inharmonic relations between target and masker or between components of the masker resulted in a complex pattern of increased or decreased masked thresholds in comparison to the predicted energetic masking. The results indicate that harmonicity cues affect the detectability of a tonal target in a complex masker.