Recent studies of auditory streaming have suggested that repeated synchronous onsets and offsets over time, referred to as "temporal coherence," provide a strong grouping cue between acoustic components, even when they are spectrally remote. This study uses a measure of auditory stream formation, based on comodulation masking release (CMR), to assess the conditions under which a loss of temporal coherence across frequency can lead to auditory stream segregation. The measure relies on the assumption that the CMR produced by flanking bands remote from the masker and target frequency only occurs if the masking and flanking bands form part of the same perceptual stream. The masking and flanking bands consisted of sequences of narrowband noise bursts, and the temporal coherence between the masking and flanking bursts was manipulated in two ways: (a) by introducing a fixed temporal offset between the flanking and masking bands that varied from zero to 60 ms and (b) by presenting the flanking and masking bursts at different temporal rates, so that the asynchronies varied from burst to burst. The results showed reduced CMR in all conditions where the flanking and masking bands were temporally incoherent, in line with the predictions of the temporal coherence hypothesis.
The perceptual organization of two-tone sequences into auditory streams was investigated using a modeling framework consisting of an auditory pre-processing front end [Dau et al., J. Acoust. Soc. Am. 102, 2892-2905 (1997)] combined with a temporal coherence-analysis back end [Elhilali et al., Neuron 61, 317-329 (2009)]. Two experimental paradigms were considered: (i) stream segregation as a function of tone repetition time (TRT) and frequency separation (Δf) and (ii) grouping of distant spectral components based on onset/offset synchrony. The simulated and experimental results of the present study supported the hypothesis that forward masking enhances the ability to perceptually segregate spectrally close tone sequences. Furthermore, the modeling suggested that effects of neural adaptation and processing through modulation-frequency selective filters may enhance the sensitivity to onset asynchrony of spectral components, facilitating the listeners' ability to segregate temporally overlapping sounds into separate auditory objects. Overall, the modeling framework may be useful for studying the contributions of bottom-up auditory features to "primitive" grouping, including in more complex acoustic scenarios than those considered here.
The ability to perceptually separate acoustic sources and focus one's attention on a single source at a time is essential for our ability to use acoustic information. In this study, a physiologically inspired model of human auditory processing [Jepsen et al., 2008] was used as a front end of a model for auditory stream segregation. A temporal coherence analysis [Elhilali et al., 2009] was applied at the output of the preprocessing, using the coherence across tonotopic channels to group activity across frequency. Using this approach, the described model is able to quantitatively account for classical streaming phenomena relying on frequency separation and tone presentation rate, such as the temporal coherence boundary and the fission boundary [van Noorden, 1975]. The same model also accounts for the perceptual grouping of distant spectral components in the case of synchronous presentation. The most essential components of the front-end and back-end processing in the framework of the presented model are analyzed and future perspectives are discussed.
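The core idea shared by these studies — grouping tonotopic channels whose temporal activity is coherent and segregating channels whose activity is not — can be illustrated with a minimal sketch. The code below is not the authors' implementation; it simply correlates the (idealized) envelopes of two frequency channels, under the assumption that a high cross-channel correlation corresponds to one perceptual stream and a low or negative correlation to segregated streams. The envelope construction and the function name `channel_coherence` are illustrative choices, not part of the cited models.

```python
import numpy as np

def channel_coherence(envelopes):
    """Pairwise correlation of tonotopic channel envelopes.

    envelopes: (n_channels, n_samples) array.
    Returns an (n_channels, n_channels) correlation matrix; entries
    near 1 indicate temporally coherent channels (candidates for
    grouping into one stream).
    """
    z = envelopes - envelopes.mean(axis=1, keepdims=True)
    norm = np.sqrt((z ** 2).sum(axis=1, keepdims=True))
    z = z / np.where(norm == 0.0, 1.0, norm)
    return z @ z.T

# Toy demo: two channels carrying 10-Hz on/off bursts (50% duty cycle).
fs = 1000                                   # samples per second
t = np.arange(0, 1.0, 1.0 / fs)
burst = ((t * 10) % 1 < 0.5).astype(float)  # idealized burst envelope

sync = np.vstack([burst, burst])            # synchronous onsets/offsets
alt = np.vstack([burst, np.roll(burst, 50)])  # 50-ms offset: anti-phase

print(channel_coherence(sync)[0, 1])  # -> 1.0 (coherent: one stream)
print(channel_coherence(alt)[0, 1])   # -> -1.0 (incoherent: segregated)
```

In the actual models, the envelopes would come from a peripheral front end (filterbank, adaptation, modulation filtering) and the coherence would be computed over sliding windows, but the grouping decision rests on the same cross-channel correlation principle.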