Acoustic sequences such as speech and music are generally perceived as coherent auditory "streams," which can be individually attended to and followed over time. Although the psychophysical stimulus parameters governing this "auditory streaming" are well established, the brain mechanisms underlying the formation of auditory streams remain largely unknown. In particular, an essential feature of the phenomenon, which corresponds to the fact that the segregation of sounds into streams typically takes several seconds to build up, remains unexplained. Here, we show that this and other major features of auditory-stream formation measured in humans using alternating-tone sequences can be quantitatively accounted for based on single-unit responses recorded in the primary auditory cortex (A1) of awake rhesus monkeys listening to the same sound sequences.
A striking feature of human perception is that our subjective experience depends not only on sensory information from the environment but also on our prior knowledge or expectations. The precise mechanisms by which sensory information and prior knowledge are integrated remain unclear, with longstanding disagreement concerning whether integration is strictly feedforward or whether higher-level knowledge influences sensory processing through feedback connections. Here we used concurrent EEG and MEG recordings to determine how sensory information and prior knowledge are integrated in the brain during speech perception. We manipulated listeners' prior knowledge of speech content by presenting matching, mismatching, or neutral written text before a degraded (noise-vocoded) spoken word. When speech conformed to prior knowledge, subjective perceptual clarity was enhanced. This enhancement in clarity was associated with a spatiotemporal profile of brain activity uniquely consistent with a feedback process: activity in the inferior frontal gyrus was modulated by prior knowledge before activity in lower-level sensory regions of the superior temporal gyrus. In parallel, we parametrically varied the level of speech degradation, and therefore the amount of sensory detail, so that changes in neural responses attributable to sensory information and prior knowledge could be directly compared. Although sensory detail and prior knowledge both enhanced speech clarity, they had an opposite influence on the evoked response in the superior temporal gyrus. We argue that these data are best explained within the framework of predictive coding in which sensory activity is compared with top-down predictions and only unexplained activity propagated through the cortical hierarchy.
A series of experiments investigated the influence of harmonic resolvability on the pitch of, and the discriminability of differences in fundamental frequency (F0) between, frequency-modulated (FM) harmonic complexes. Both F0 (62.5 to 250 Hz) and spectral region (LOW: 125-625 Hz, MID: 1375-1875 Hz, and HIGH: 3900-5400 Hz) were varied orthogonally. The harmonics that comprised each complex could be summed in either sine (0 degree) phase (SINE) or alternating sine-cosine (0 degree-90 degrees) phase (ALT). Stimuli were presented in a continuous pink-noise background. Pitch-matching experiments revealed that the pitch of ALT-phase stimuli, relative to SINE-phase stimuli, was increased by an octave in the HIGH region, for all F0's, but was the same as that of SINE-phase stimuli when presented in the LOW region. In the MID region, the pitch of ALT-phase relative to SINE-phase stimuli depended on F0, being an octave higher at low F0's, equal at high F0's, and unclear at intermediate F0's. The same stimuli were then used in three measures of discriminability: FM detection thresholds (FMTs), frequency difference limens (FDLs), and FM direction discrimination thresholds (FMDDTs, defined as the minimum FM depth necessary for listeners to discriminate between two complexes modulated 180 degrees out of phase with each other). For all three measures, at all F0's, thresholds were low (< 4% for FMTs, < 5% for FMDDTs, and < 1.5% for FDLs) when stimuli were presented in the LOW region, and high (> 10% for FMTs, > 7% for FMDDTs, and > 2.5% for FDLs) when presented in the HIGH region. When stimuli were presented in the MID region, thresholds were low for low F0's, and high for high F0's. Performance was not markedly affected by the phase relationship between the components of a complex, except for stimuli with intermediate F0's in the MID spectral region, where FDLs and FMDDTs were much higher for ALT-phase stimuli than for SINE-phase stimuli, consistent with their unclear pitch. This difference was much smaller when FMTs were measured. The interaction between F0 and spectral region for both sets of experiments can be accounted for by a single definition of resolvability.
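The stimulus construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a rectangular passband (a harmonic is included when its frequency falls inside the stated region), equal-amplitude components, and that ALT phase puts odd harmonics in sine phase and even harmonics in cosine phase. The function names, sample rate, and duration are hypothetical, and the frequency modulation and pink-noise background are omitted.

```python
import math

def harmonic_numbers(f0, lo_hz, hi_hz):
    """Harmonic numbers n whose frequency n*f0 lies within [lo_hz, hi_hz]."""
    first = math.ceil(lo_hz / f0)
    last = math.floor(hi_hz / f0)
    return list(range(first, last + 1))

def harmonic_complex(f0, lo_hz, hi_hz, phase="SINE", sr=16000, dur=0.1):
    """Sum equal-amplitude harmonics of f0 inside a spectral region.

    phase="SINE": every component starts at 0 degrees.
    phase="ALT":  odd harmonics at 0 degrees, even harmonics at 90 degrees
                  (assumed assignment; the abstract does not specify it).
    """
    numbers = harmonic_numbers(f0, lo_hz, hi_hz)
    n_samples = int(round(sr * dur))
    samples = []
    for i in range(n_samples):
        t = i / sr
        value = 0.0
        for n in numbers:
            offset = math.pi / 2 if (phase == "ALT" and n % 2 == 0) else 0.0
            value += math.sin(2 * math.pi * n * f0 * t + offset)
        samples.append(value)
    return samples

# Regions as given in the abstract (Hz).
REGIONS = {"LOW": (125, 625), "MID": (1375, 1875), "HIGH": (3900, 5400)}

low_sine = harmonic_complex(125, *REGIONS["LOW"], phase="SINE")
low_alt = harmonic_complex(125, *REGIONS["LOW"], phase="ALT")
```

With a 125 Hz F0, the LOW region holds harmonics 1 to 5, whereas a 250 Hz F0 in the HIGH region holds harmonics 16 to 21. That contrast between low and high harmonic numbers is the one the abstract relates to resolvability.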
Often, the sound arriving at the ears is a mixture from many different sources, but only one is of interest. To assist with selection, the auditory system structures the incoming input into streams, each of which ideally corresponds to a single source. Some authors have argued that this process of streaming is automatic and invariant, but recent evidence suggests it is affected by attention. In Experiments 1 and 2, it is shown that the effect of attention is not a general suppression of streaming on an unattended side of the ascending auditory pathway or in unattended frequency regions. Experiments 3 and 4 investigate the effect on streaming of physical gaps in the sequence and of brief switches in attention away from a sequence. The results demonstrate that after even short gaps or brief switches in attention, streaming is reset. The implications are discussed, and a hierarchical decomposition model is proposed.