This paper investigates the cues used by the auditory system in the perceptual organization of sequential sounds, in particular the ability to organize sounds in the absence of spectral cues. In the first experiment, listeners were presented with a repeating tone sequence ABA ABA ..., where the fundamental frequency (f0) of tone A was fixed at 100 Hz and the f0 difference between tones A and B varied across trials between 1 and 11 semitones. Three spectral conditions were tested: pure tones, harmonic complexes filtered into a passband between 500 and 2000 Hz, and harmonic complexes filtered into a passband chosen so that only harmonics above the tenth were passed, thus severely limiting spectral information. In all conditions, listeners generally reported that they could segregate tones A and B into two separate perceptual streams when the f0 interval exceeded about four semitones. The second experiment showed that most listeners were better able to recognize a short atonal melody interleaved with random distracting tones when the distracting tones were in an f0 region 11 semitones above the melody than when they were in the same f0 region. The results were similar for pure tones and for complex tones comprising only high, unresolved harmonics. Together, the two experiments show that spectral separation is not a necessary condition for perceptual stream segregation, suggesting that models of stream segregation based solely on spectral properties may require revision.
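For reference, the f0 intervals above are equal-tempered semitones; the short sketch below (illustrative only, not taken from the study) converts the interval sizes into the corresponding f0 values of tone B:

```python
def f0_above(f0_a: float, semitones: float) -> float:
    """Return the fundamental frequency `semitones` above `f0_a`, in Hz,
    using the equal-tempered ratio 2**(semitones / 12)."""
    return f0_a * 2.0 ** (semitones / 12.0)

f0_a = 100.0  # fixed f0 of tone A
for n in (1, 4, 11):
    print(f"{n:2d} semitones -> f0_B = {f0_above(f0_a, n):6.1f} Hz")
# 1 -> 105.9 Hz; 4 -> 126.0 Hz (near the reported segregation boundary);
# 11 -> 188.8 Hz
```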
Human sound localization relies on implicit head-centered acoustic cues. However, to create a stable and accurate representation of sounds despite intervening head movements, the acoustic input should be continuously combined with feedback signals about changes in head orientation. Alternatively, the auditory target coordinates could be updated in advance using either the preprogrammed gaze-motor command or the sensory target coordinates to which the intervening gaze shift is made ("predictive remapping"). Previous experiments have been unable to dissociate these alternatives. Here, we study whether the auditory system compensates for ongoing two-dimensional saccadic eye and head movements that occur during target presentation. In this case, the system has to deal with dynamic changes of the acoustic cues as well as with rapid changes in relative eye and head orientation that cannot be preprogrammed by the audiomotor system. We performed two-dimensional visual-auditory double-step experiments in which a brief sound burst was presented while subjects made a saccadic eye-head gaze shift toward a previously flashed visual target. Our results show that localization responses under these dynamic conditions remain accurate. Multiple linear regression analysis revealed that the intervening eye and head movements are fully accounted for. Moreover, elevation response components were more accurate for longer-duration sounds (50 msec) than for extremely brief sounds (3 msec), in all localization conditions. Taken together, these results cannot be explained by a predictive remapping scheme. Rather, we conclude that the human auditory system adequately processes the dynamically varying acoustic cues that result from self-initiated rapid head movements, constructing a stable representation of the target in world coordinates. This signal is subsequently used to program accurate eye and head localization responses.
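A minimal sketch of the kind of multiple linear regression described above; the synthetic data, variable names, and the particular model form (gaze response as a linear function of the initial head-centered target location and the intervening eye and head displacements) are illustrative assumptions, not the authors' analysis code:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
target = rng.uniform(-30, 30, n)   # head-centered target at burst onset (deg)
d_eye  = rng.uniform(-20, 20, n)   # intervening eye-in-head displacement (deg)
d_head = rng.uniform(-20, 20, n)   # intervening head-in-space displacement (deg)
# Simulated fully compensating responses: the remaining motor error equals
# the target minus whatever the eyes and head have already moved.
response = target - d_eye - d_head + rng.normal(0, 2, n)

# Least-squares fit of: response = a*target + b*d_eye + c*d_head + bias
X = np.column_stack([target, d_eye, d_head, np.ones(n)])
(a, b, c, bias), *_ = np.linalg.lstsq(X, response, rcond=None)
print(f"a={a:+.2f}  b={b:+.2f}  c={c:+.2f}  bias={bias:+.2f}")
# Full compensation predicts a near +1 and b, c near -1.
```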
In a previous paper, it was shown that sequential stream segregation can be based on both spectral information and periodicity information when listeners are encouraged to hear segregation [Vliegen and Oxenham, J. Acoust. Soc. Am. 105, 339-346 (1999)]. The present paper investigates whether segregation based on periodicity information alone also occurs when the task requires integration. This addresses the question: is segregation based on periodicity automatic and obligatory? A temporal discrimination task was used, as there is evidence that it is difficult to compare the timing of auditory events perceived as being in different perceptual streams. An ABA ABA ABA... sequence was used, in which tone B could be either exactly at the temporal midpoint between two successive A tones or slightly delayed. Tones A and B were of three types: (1) both pure tones; (2) both complex tones filtered through a fixed passband so as to contain only harmonics higher than the 10th, thereby eliminating detectable spectral differences; here only the fundamental frequency (f0) varied between tones A and B; and (3) both complex tones with the same f0, but with the center frequency of the spectral passband varying between tones. Tone A had a fixed frequency of 300 Hz (when A and B were pure tones) or a fixed f0 of 100 Hz (when A and B were complex tones). Five intervals, ranging from 1 to 18 semitones, were used. In all three conditions, shift thresholds increased with increasing interval between tones A and B, but the effect was largest in the conditions where A and B differed in spectrum (the pure-tone and variable-center-frequency conditions). The results suggest that spectral information is dominant in inducing (involuntary) segregation, but that periodicity information can also play a role.
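As an illustration of condition (2), the sketch below (with assumed passband edges and equal-amplitude components; not the published stimulus code) builds a harmonic complex whose fixed passband admits only high-numbered, unresolved harmonics, so that changing f0 changes the periodicity while leaving the spectral envelope essentially unchanged:

```python
import numpy as np

fs, dur = 44100, 0.1                      # sample rate (Hz), duration (s)
t = np.arange(int(fs * dur)) / fs

def high_harmonic_complex(f0: float, lo_hz: float = 2000.0,
                          hi_hz: float = 5000.0) -> np.ndarray:
    """Sum equal-amplitude harmonics of f0 falling inside a fixed passband.

    With f0 = 100 Hz and a 2000-5000 Hz passband (assumed values), the
    lowest component is the 20th harmonic, well above the roughly
    10th-harmonic limit of spectral resolvability.
    """
    ks = [k for k in range(1, int(hi_hz / f0) + 1) if lo_hz <= k * f0 <= hi_hz]
    return sum(np.sin(2 * np.pi * k * f0 * t) for k in ks)

tone_a = high_harmonic_complex(100.0)                  # f0 of tone A
tone_b = high_harmonic_complex(100.0 * 2 ** (4 / 12))  # 4 semitones higher
```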
The localization of sounds in the vertical plane (elevation) deteriorates for short-duration wideband sounds at moderate to high intensities. The effect is described by a systematic decrease of the elevation gain (slope of stimulus-response relation) at short sound durations. Two hypotheses have been proposed to explain this finding. Either the sound localization system integrates over a time window that is too short to accurately extract the spectral localization cues (neural integration hypothesis), or the effect results from cochlear saturation at high intensities (adaptation hypothesis). While the neural integration model predicts that elevation gain is independent of sound level, the adaptation hypothesis holds that low elevation gains for short-duration sounds are only obtained at high intensities. Here, these predictions are tested over a larger range of stimulus parameters than has been done so far. Subjects responded with rapid head movements to noise bursts in the two-dimensional frontal space. Stimulus durations ranged from 3 to 100 ms; sound levels from 26 to 73 dB SPL. Results show that the elevation gain decreases for short noise bursts at all sound levels, a finding that supports the integration model. On the other hand, the short-duration gain also decreases at high sound levels, which is in line with the adaptation hypothesis. The finding that elevation gain was a nonmonotonic function of sound level for all sound durations, however, is predicted by neither model. It is concluded that both mechanisms underlie the elevation gain effect and a conceptual model is proposed to reconcile these findings.
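The elevation gain is simply the slope of a straight-line fit to the stimulus-response relation; a short sketch with synthetic data (assumed values, not the study's analysis code) makes the measure concrete:

```python
import numpy as np

rng = np.random.default_rng(1)
stim_elev = rng.uniform(-40, 40, 100)                # stimulus elevations (deg)
resp_elev = 0.6 * stim_elev + rng.normal(0, 5, 100)  # undershooting responses

gain, offset = np.polyfit(stim_elev, resp_elev, 1)   # first-order (linear) fit
print(f"elevation gain = {gain:.2f}, offset = {offset:.1f} deg")
# A gain near 1 indicates accurate elevation localization; the abstract
# reports systematically lower gains for short-duration bursts.
```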
In this paper, two experiments are presented that investigate the effects of stereoscopic filming parameters and display duration on observers' judgements of the naturalness and quality of stereoscopic images. The paper first reviews the literature on temporal factors in stereoscopic vision, with reference to stereoscopic displays. Several studies have indicated an effect of display duration on performance-oriented (criterion-based) measures. The experiments reported here extend the study of display duration from performance-oriented to appreciation-oriented measures. In addition, the present study investigates the effects of manipulating camera separation, convergence distance, and focal length on perceived quality and naturalness. In the first experiment, using display durations of 5 and 10 s, 12 observers rated naturalness of depth and quality of depth for stereoscopic still images. The results showed no significant main effect of display duration. A small yet significant shift between naturalness and quality ratings was found for both duration conditions. This result replicated earlier findings, indicating that the effect is reliable, albeit content-dependent. The second experiment used display durations ranging from 1 to 15 s. Its results showed a small yet significant effect of display duration: whereas longer display durations do not negatively affect the appreciative scores of optimally reproduced stereoscopic images, observers give lower judgements to monoscopic images and to stereoscopic images with unnatural disparity values as display duration increases. Finally, the results of both experiments support the argument that stereoscopic camera toe-in should be avoided where possible.
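For orientation, the sketch below uses textbook stereo geometry (not taken from the paper) to show how the three manipulated filming parameters jointly set image disparity for a parallel, shifted-sensor camera rig; toe-in instead converges the cameras by rotating them, which introduces keystone distortion and vertical disparities, one reason toe-in is discouraged above:

```python
def sensor_disparity(b_m: float, f_mm: float, z_conv_m: float, z_m: float) -> float:
    """Approximate horizontal disparity (mm on the sensor) of a point at depth
    z_m, for camera separation b_m, focal length f_mm, and convergence
    distance z_conv_m:  d = f * b * (1/z_conv - 1/z)."""
    return f_mm * b_m * (1.0 / z_conv_m - 1.0 / z_m)

# Example: 65 mm camera separation, 35 mm lens, converged at 3 m.
for z in (1.0, 3.0, 10.0):
    d = sensor_disparity(0.065, 35.0, 3.0, z)
    print(f"depth {z:4.1f} m -> disparity {d:+.3f} mm")
```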