Searching for an object within a cluttered, continuously changing environment can be a very time-consuming process. The authors show that a simple auditory pip drastically decreases search times for a synchronized visual object that is normally very difficult to find. This effect occurs even though the pip contains no information about the location or identity of the visual object. The experiments also show that the effect is not due to general alerting (because it does not occur with visual cues), nor to top-down cuing of the visual change (because it still occurs when the pip is synchronized with distractors on the majority of trials). Instead, the authors propose that the temporal information of the auditory signal is integrated with the visual signal, generating a relatively salient emergent feature that automatically draws attention. Phenomenally, the synchronous pip makes the visual object pop out from its complex environment, providing a direct demonstration of a spatially nonspecific sound affecting competition in spatial visual processing.
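The pip-and-pop paradigm described above can be made concrete with a toy trial schedule: every item changes color at random moments, and a tone is played only at the target's change times, so the sound carries no spatial or identity information. The following Python sketch is a hypothetical reconstruction; the item count, trial duration, and change intervals are illustrative assumptions, not the parameters of the original experiments.

```python
import random

def make_trial(n_items=24, duration_ms=4000, mean_interval_ms=900, seed=0):
    """Toy pip-and-pop trial schedule (all parameters are illustrative).

    Every item flips color at random moments; a pip accompanies only the
    target's flips, so the pip reveals nothing about where the target is.
    """
    rng = random.Random(seed)
    target = rng.randrange(n_items)
    events = []  # (time_ms, item_index, pip_played)
    for item in range(n_items):
        t = rng.expovariate(1.0 / mean_interval_ms)
        while t < duration_ms:
            events.append((int(t), item, item == target))
            t += rng.expovariate(1.0 / mean_interval_ms)
    return target, sorted(events)

if __name__ == "__main__":
    target, events = make_trial()
    pip_times = [t for t, _, pip in events if pip]
    print(f"target item: {target}; pip times (ms): {pip_times}")
```

Because the pip is yoked to the target's change times rather than to its position, any search benefit must arise from audiovisual temporal integration rather than from spatial cuing.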
How do we recognize what one person is saying when others are speaking at the same time? This review summarizes the wide-ranging research in psychoacoustics, auditory scene analysis, and attention that this question has stimulated, all dealing with early processing and selection of speech. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and "unmasking" resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping (the segregation and streaming of sounds) represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped, and subsequently selected, using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there is also evidence that the depth of processing depends on the task relevance of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of the to-be-selected input. Despite recent progress, many issues remain unresolved: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on cues other than spatial location or voice characteristics, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
A study was made of the effect of interaural time delay (ITD) and acoustic head shadow on binaural speech intelligibility in noise. A free-field condition was simulated by presenting recordings, made with a KEMAR manikin in an anechoic room, through earphones. Recordings were made of speech, reproduced in front of the manikin, and of noise emanating from seven angles in the azimuthal plane, ranging from 0 degrees (frontal) to 180 degrees in steps of 30 degrees. From this noise, two signals were derived, one containing only ITD, the other containing only interaural level differences (ILD) due to head shadow. Using this material, speech-reception thresholds (SRT) for sentences in noise were determined for a group of normal-hearing subjects. Results show that (1) for noise azimuths between 30 degrees and 150 degrees, the gain due to ITD lies between 3.9 and 5.1 dB, while the gain due to ILD ranges from 3.5 to 7.8 dB, and (2) ILD decreases the effectiveness of binaural unmasking due to ITD (on average, the threshold shift drops from 4.6 to 2.6 dB). In a second experiment, also conducted with normal-hearing subjects, similar stimuli were used, but presented monaurally or with an overall 20-dB attenuation in one channel, in order to simulate hearing loss. In addition, SRTs were determined for noise with fixed ITDs, for comparison with the results obtained with head-induced (frequency-dependent) ITDs.
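The study's distinction between head-induced (frequency-dependent) ITDs and fixed ITDs can be illustrated with the classic Woodworth spherical-head approximation. This is a textbook model, not the KEMAR-based method of the study, and the head radius and speed of sound below are assumed values.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # speed of sound in air at roughly 20 degrees C

def woodworth_itd(azimuth_deg):
    """Estimate the ITD in seconds for a far-field source (Woodworth model).

    azimuth_deg: 0 = frontal, 90 = directly to one side, 180 = behind.
    """
    theta = math.radians(azimuth_deg)
    if theta <= math.pi / 2:
        extra_path = theta + math.sin(theta)
    else:
        # Rear hemifield: by symmetry the ITD falls back to 0 at 180 degrees.
        extra_path = math.pi - theta + math.sin(theta)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * extra_path

# Same azimuths as the noise recordings: 0 to 180 degrees in steps of 30.
for az in range(0, 181, 30):
    print(f"{az:3d} deg -> ITD {woodworth_itd(az) * 1e6:5.0f} us")
```

The estimate peaks near 90 degrees at roughly 650 microseconds, the right order of magnitude for human ITDs; real head-induced ITDs additionally vary with frequency, which is exactly the component the fixed-ITD control condition removes.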
Speech-reception thresholds (SRT) were measured for 17 normal-hearing and 17 hearing-impaired listeners in conditions simulating free-field situations with between one and six interfering talkers. The stimuli, speech and noise with identical long-term average spectra, were recorded with a KEMAR manikin in an anechoic room and presented to the subjects through headphones. The noise was modulated using the envelope fluctuations of the speech. Several conditions were simulated, with the speaker always in front of the listener and the maskers either also in front or positioned in a symmetrical or asymmetrical configuration around the listener. Results show that the hearing-impaired listeners perform significantly more poorly than the normal-hearing listeners in all conditions; the mean SRT differences between the groups range from 4.2 to 10 dB. The modulations in the masker appear to act as an important cue for the normal-hearing listeners, who experience up to 5 dB of release from masking, while being hardly beneficial for the hearing-impaired listeners. The gain occurring when maskers are moved from the frontal position to positions around the listener varies from 1.5 to 8 dB for the normal-hearing and from 1 to 6.5 dB for the hearing-impaired listeners. It depends strongly on the number of maskers and their positions, but less on hearing impairment. The difference between the SRTs for binaural and best-ear listening (the "cocktail party effect") is approximately 3 dB in all conditions for both groups.
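All of the gains reported above are simple differences between SRTs measured in two listening conditions. The snippet below shows that bookkeeping with made-up SRT values; the numbers are purely illustrative, and only the arithmetic follows the definitions used in the abstract.

```python
# Hypothetical SRTs in dB signal-to-noise ratio (lower is better).
# These values are invented for illustration, not data from the study.
srt = {
    "maskers_frontal_binaural": -6.0,   # maskers co-located with the speech
    "maskers_spatial_binaural": -10.0,  # maskers distributed around the listener
    "maskers_spatial_best_ear": -7.0,   # same layout, best single ear only
}

# Spatial gain: improvement when maskers move from the front to the sides.
spatial_gain_db = srt["maskers_frontal_binaural"] - srt["maskers_spatial_binaural"]

# "Cocktail party effect": benefit of binaural over best-ear listening.
binaural_gain_db = srt["maskers_spatial_best_ear"] - srt["maskers_spatial_binaural"]

print(f"spatial gain: {spatial_gain_db:.1f} dB")
print(f"binaural gain: {binaural_gain_db:.1f} dB")
```

With these invented numbers the spatial gain is 4.0 dB and the binaural gain 3.0 dB; the corresponding measured values in the study ranged from 1 to 8 dB depending on masker configuration, and sat near 3 dB, respectively.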