This paper describes the problem of estimating the location of a sound source from signals observed at the entrances of the two ears. The purpose is to identify the talker's and listener's positions within a car using the binaural signal. The talker and the listener sit in two of the four car seats. In this experiment, two kinds of head and torso simulators are used as the talker and the listener. The given information includes the acoustic transfer functions for all positional patterns. Eight patterns of acoustic transfer functions are measured, including patterns in which the positions are the same but the talker faces a different direction. A Gaussian mixture model is generated for each positional pattern. The feature parameters are interaural quantities such as the envelope of the interaural level difference. The models are evaluated by how well they identify the positional pattern. Results show that positions can be identified with up to 97% (35/36) accuracy using the binaural signals of two men. The input signal was then extended to include background noise resembling a real situation, and a model that involves motion of the talker's head was also considered.
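The decision rule implied above (one Gaussian mixture model per positional pattern, with the best-scoring pattern chosen for a test signal) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the diagonal covariances, the number of mixture components, the plain EM fitting loop, and the names `fit_diag_gmm`, `total_log_likelihood`, and `classify` are all assumptions introduced here.

```python
import numpy as np

def _component_log_probs(X, weights, means, variances):
    # (n_frames, k) matrix of log weight + log N(x | mean, diag(variance)).
    diff = X[:, None, :] - means[None, :, :]
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
    quad = -0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    return np.log(weights)[None, :] + log_norm[None, :] + quad

def fit_diag_gmm(X, k=2, iters=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities of each component for each frame.
        log_p = _component_log_probs(X, weights, means, variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ X ** 2) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

def total_log_likelihood(X, gmm):
    """Sum over frames of the log-likelihood under one pattern's GMM."""
    log_p = _component_log_probs(X, *gmm)
    m = log_p.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))).sum())

def classify(X, models):
    """Return the pattern label whose GMM gives the highest likelihood."""
    return max(models, key=lambda name: total_log_likelihood(X, models[name]))
```

In use, `models` would map each positional-pattern label to the tuple returned by `fit_diag_gmm` on that pattern's training features, and `classify` would be called on the feature frames of one test signal.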
In this paper, we examine how covering one or both external ears affects sound localization on the horizontal plane. In our experiments, we covered subjects' pinnae and external auditory canals with headphones, earphones, and earplugs, and conducted sound localization tests. Stimuli were presented from 12 different directions, and 12 subjects participated in the tests. The results indicate that covering one or both ears decreased the subjects' sound localization performance. Front-back confusion rates increased, particularly when both outer ears were covered with open-air headphones or one ear was covered with an intraconcha-type earphone or an earplug. Furthermore, incorrect answer rates were high when the sound source was on the same side as an ear occluded by an intraconcha-type earphone or an earplug. We consider that the factors that cause poor performance can be clarified by comparing these results with the characteristics of the head-related transfer function.
Introduction
The detection of sound source direction is a very important technique and is in wide use in fields such as speech enhancement, sound recording, and security systems. Robot hearing is an important subject related to the detection of sound source direction [1,2]. Such technology needs to achieve the same performance as human beings in various environments. To date there have been many studies based on microphone arrays; these reports describe methods that employ many microphones to obtain high detection performance. However, reducing the number of microphones could contribute to lowering costs and facilitating maintenance.
In this paper, speaker and listener positions are estimated from a binaural signal in noisy conditions. Several methods of detecting sound source direction using binaural signals have been proposed. In our research, the method using the cepstrum of an interaural level difference (ILD) [3] was employed. In that study, experiments for estimating sound source direction were conducted in eight reverberation conditions, and the results showed that the cepstrum of ILD was a useful feature parameter. However, evaluating not only user positions but also user situations and environments is an important problem for robot hearing. Our experiments were conducted for an in-car environment [4], which can be considered a complex acoustic environment because it is a limited space that contains many acoustic materials with various shapes. Moreover, in a car a speaker often talks from behind a listener or without looking at the listener. Consequently, different transfer functions exist between the speaker and the listener even though their positions do not change. Therefore, the purpose of our investigation is to identify the speaker and listener positions within a car using binaural signals. Positional patterns that involve motion of the speaker's head are also considered. The given information includes a statistical model of the ILD cepstrum for all positional patterns. A Gaussian mixture model is generated for each positional pattern. The models are evaluated by identifying the positional pattern in six binaural signal-to-noise ratio conditions.
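As a rough illustration of the feature named above, the sketch below computes a per-frame cepstrum of the ILD from a two-channel signal. The exact definition used in [3] is not given in this text, so the choices here are assumptions: the ILD is taken as the left-to-right magnitude ratio in dB per frequency bin, the cepstrum as the inverse real FFT of that dB spectrum, and only the first few low-quefrency coefficients are kept. The function name `ild_cepstrum` and all parameter values are hypothetical.

```python
import numpy as np

def ild_cepstrum(left, right, frame_len=512, hop=256, n_coef=12, eps=1e-12):
    """Per-frame low-quefrency cepstrum of the ILD spectrum (assumed definition)."""
    window = np.hanning(frame_len)
    feats = []
    for start in range(0, len(left) - frame_len + 1, hop):
        # Short-time spectra of the two ear signals.
        spec_l = np.fft.rfft(window * left[start:start + frame_len])
        spec_r = np.fft.rfft(window * right[start:start + frame_len])
        # ILD in dB per frequency bin; eps guards against division by zero.
        ild_db = 20.0 * np.log10((np.abs(spec_l) + eps) / (np.abs(spec_r) + eps))
        # Inverse transform of the dB spectrum gives the ILD cepstrum;
        # the leading coefficients describe its smooth spectral envelope.
        cep = np.fft.irfft(ild_db)
        feats.append(cep[:n_coef])
    return np.array(feats)
```

The returned array has one row per frame, and rows of this kind could serve as the training vectors for the per-pattern Gaussian mixture models.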