In reverberant rooms with multiple people talking, spatial separation between speech sources improves recognition of attended speech, even though both the head-shadowing and interaural-interaction unmasking cues are limited by numerous reflections. It is the perceptual integration between the direct wave and its reflections that bridges the direct-reflection temporal gaps and results in spatial unmasking under reverberant conditions. This study further investigated (1) the temporal dynamics of the direct-reflection-integration-based spatial unmasking as a function of the reflection delay, and (2) whether these temporal dynamics are correlated with the listeners’ auditory ability to temporally retain raw acoustic signals (i.e., the fast-decaying primitive auditory memory, PAM). The results showed that recognition of the target speech against the speech-masker background is a descending exponential function of the delay of the simulated target reflection. In addition, the temporal extent of PAM is frequency dependent and markedly longer than that for perceptual fusion. More importantly, the temporal dynamics of the speech-recognition function are significantly correlated with the temporal extent of the PAM of low-frequency raw signals. Thus, we propose that a chain process, which links the earlier-stage PAM with the later-stage correlation computation, perceptual integration, and attention facilitation, plays a role in spatially unmasking target speech under reverberant conditions.
Human listeners are extraordinarily sensitive to a transient break in interaural correlation (called a binaural gap). In this study, a binaural gap embedded in interaurally correlated noise markers elicited marked scalp event-related potentials (ERPs). ERPs to the binaural gap in narrowband noise with a center frequency of 1600 Hz were significantly weaker than those for narrowband noise with a center frequency of 400 or 800 Hz. Introducing an interaural time difference (ITD) of 4 ms weakened the ERPs for both 400-Hz and 800-Hz noise. Introducing an ITD of 2 ms, however, weakened the ERPs only for 800-Hz noise, not for 400-Hz noise. Thus, central representations of a transient break in interaural correlation for narrowband noises are affected by both frequency and ITD.
This study investigated whether sound intensity affects listeners' sensitivity to a break in interaural correlation (BIC) embedded in wideband noise at different interaural delays. The results show that the duration threshold for detection remained stable at intensities between 60 and 70 dB SPL, but increased at an accelerating rate as the intensity decreased toward 40 dB SPL. Moreover, the threshold rose linearly as the interaural delay increased from 0 to 4 ms, and the slope of this rise became steeper as the intensity decreased from 50 to 40 dB SPL. Thus, detection of the BIC is co-modulated by both intensity and interaural delay.
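A stimulus of the kind described above (interaurally correlated wideband noise containing an uncorrelated fragment, presented with an interaural delay) can be sketched as follows. This is only an illustrative sketch, not the procedure used in the study: the function name `make_bic_noise`, the Gaussian noise, the sampling rate, the centered position of the break, and all default values are assumptions.

```python
import numpy as np

def make_bic_noise(duration_s=1.0, gap_s=0.05, delay_s=0.002, fs=48000, seed=0):
    """Return (left, right, gap_start) for a noise with a break in
    interaural correlation (BIC). All parameter values are hypothetical."""
    rng = np.random.default_rng(seed)
    n = int(round(duration_s * fs))      # samples in the noise
    gap_n = int(round(gap_s * fs))       # samples in the uncorrelated fragment
    delay_n = int(round(delay_s * fs))   # interaural delay in samples

    # Identical noise at both ears gives an interaural correlation of 1.
    left = rng.standard_normal(n)
    right = left.copy()

    # Replace the central fragment in one ear with independent noise,
    # so the interaural correlation drops to about 0 inside the break.
    gap_start = (n - gap_n) // 2
    right[gap_start:gap_start + gap_n] = rng.standard_normal(gap_n)

    # Delay the right-ear signal; pad so both channels have equal length.
    left = np.concatenate([left, np.zeros(delay_n)])
    right = np.concatenate([np.zeros(delay_n), right])
    return left, right, gap_start

left, right, gap_start = make_bic_noise()
```

Varying `gap_s` across trials would correspond to searching for the duration threshold, and scaling both channels would correspond to changing the presentation level; calibration to dB SPL depends on the playback hardware and is not modeled here.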
The subjective representation of the sounds delivered to the two ears of a human listener is closely associated with the interaural delay and correlation of these two-ear sounds. When the two-ear sounds, e.g., arbitrary noises, arrive simultaneously, the single auditory image of the binaurally identical noises becomes increasingly diffuse, and eventually separates into two auditory images as the interaural correlation decreases. When the interaural delay increases from zero to several milliseconds, the auditory image of the binaurally identical noises also changes from a single image to two distinct images. However, the effects of these two factors have not been measured in the same group of participants. This study examined the impacts of interaural correlation and delay on detecting a binaurally uncorrelated fragment (interaural correlation = 0) embedded in binaurally correlated noises (i.e., a binaural gap or break in interaural correlation). We found that the minimum duration of the binaural gap needed for its detection (i.e., the duration threshold) increased exponentially as the interaural delay between the binaurally identical noises increased linearly from 0 to 8 ms. When no interaural delay was introduced, the duration threshold also increased exponentially as the interaural correlation of the binaurally correlated noises decreased linearly from 1 to 0.4. A linear relationship between the effect of interaural delay and that of interaural correlation was described for the listeners participating in this study: in terms of raising the duration threshold, a 1 ms increase in interaural delay appeared to correspond to a 0.07 decrease in interaural correlation. Our results imply that a tradeoff may exist between the impacts of interaural correlation and interaural delay on the subjective representation of sounds delivered to the two human ears.
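The reported tradeoff (about 0.07 of interaural correlation per 1 ms of interaural delay) can be written as a simple linear conversion. The sketch below is only an illustration of that arithmetic: the function name `equivalent_correlation` is hypothetical, and the restriction to the 0 to 8 ms range is an assumption based on the range of delays tested.

```python
TRADEOFF_PER_MS = 0.07   # correlation decrease per 1 ms of delay (from the abstract)
MAX_DELAY_MS = 8.0       # largest interaural delay tested (from the abstract)

def equivalent_correlation(delay_ms):
    """Interaural correlation (at zero delay) assumed to raise the duration
    threshold about as much as the given interaural delay (at correlation 1)."""
    if not 0.0 <= delay_ms <= MAX_DELAY_MS:
        raise ValueError("delay outside the tested 0-8 ms range")
    return 1.0 - TRADEOFF_PER_MS * delay_ms

for delay_ms in (0.0, 2.0, 4.0, 8.0):
    print(f"{delay_ms:.0f} ms delay ~ correlation {equivalent_correlation(delay_ms):.2f}")
```

Under this linear mapping, the largest tested delay of 8 ms corresponds to a correlation of about 0.44, which falls within the tested correlation range of 1 to 0.4.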