An experiment is described in which time and intensity differences of 2-kc high-pass clicks were mutually offset to produce sound images centered in the head. Binaurally correlated and uncorrelated clicks were used, and the trade was tested at 10-70 db SL. The results show that generally the two types of clicks behave similarly, and that up to 60 db SL, at least, as over-all intensity increases, the time difference compensating a given intensity difference (in db) decreases. A function is derived describing what is interpreted as a physiological intensity-to-time conversion. The place of such a conversion in lateralization is discussed.

BINAURAL interactions lead to a number of psychologically striking phenomena. A central aspect in many of these is the formation of a unitary sound image which occupies a restricted region of the observer's perceptual space. For instance, when a listener, in an acoustic field generated by a localized source, samples the field at two points by means of his two ears, the resultant neural signals are combined to yield a percept projected near the physical location of the source. Furthermore, the subjective image is often identified with the source; this process is usually referred to as "localization." A localized image also appears when the binaural stimuli are delivered through earphones. Here, however, the image appears near or even inside the head and is not usually associated either with any external source or with the earphones. To preserve this distinction, the earphone case is referred to as "lateralization."

The foregoing remarks mean that some binaurally heard stimuli may produce a sound image different from that produced by either stimulus heard monaurally. From the psychophysical point of view, of course, such differences arise from interaural relations which are not available from either stimulus alone. One of the general aims of the work reported in this paper is to identify certain interaural relations with one aspect, namely, "centering" of the image. A secondary aim is to deduce from the results something about the physiological mechanisms responsible for the formation of the image. In the latter effort, of course, we rely heavily upon anatomical and neurophysiological work from other sources. Inferences and models resulting from such work had important bearing upon the design of our experiment.

… not been studied; and, from our preliminary result and that of Leakey, Sayers, and Cherry, it appears that coherent signals are not required. What is required, apparently, is a temporal cue to each ear provided by a sharp onset or a discontinuity, or, generally, some low-frequency prominence in the waveform envelope. These cues need not be embedded in correlated stimuli.
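As a concrete illustration of the kind of intensity-to-time conversion the abstract describes, the following is a minimal Python sketch. The linear form of the trading ratio and every numerical value here are illustrative assumptions chosen for demonstration; they are not the function derived in the paper.

```python
# Illustrative sketch of a time-intensity trading computation for
# binaural lateralization. The model form and the numbers are
# assumptions for demonstration only, NOT the paper's fitted function.

def trading_ratio_us_per_db(sensation_level_db: float) -> float:
    """Hypothetical trading ratio (microseconds of interaural delay per
    dB of interaural level difference). It decreases with over-all
    level, as the abstract reports for levels up to about 60 dB SL."""
    # Assumed: ratio falls linearly from 25 us/dB at 10 dB SL
    # to 10 us/dB at 60 dB SL, then flattens.
    sl = min(max(sensation_level_db, 10.0), 60.0)
    return 25.0 - 0.3 * (sl - 10.0)

def centering_delay_us(level_diff_db: float, sensation_level_db: float) -> float:
    """Interaural time difference (us) that offsets a given interaural
    level difference (dB) so the image is centered in the head."""
    return trading_ratio_us_per_db(sensation_level_db) * level_diff_db

if __name__ == "__main__":
    for sl in (10, 30, 50, 60):
        delay = centering_delay_us(6.0, sl)
        print(f"{sl} dB SL: {delay:.0f} us offsets a 6-dB difference")
```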
A study of vowel sounds by means of a spectral analysis keyed synchronously to the voice pitch has been carried out. Spectra are obtained by Fourier analysis of individual pitch periods which were established by visual inspection of oscillograms. A digital computer served as the analyzer. The spectra are represented by a pattern of zeros and poles obtained by a process of successive approximation, again carried out by computer. The contributions from vocal tract and glottal source can be uniquely separated and examined. These results show that vowel sounds can be represented by a sequence of poles arising from the vocal tract and a sequence of zeros characterizing the glottal excitation. The frequencies of the vocal tract poles agreed with previous measurements, but the damping factors were not entirely consistent with earlier estimates. The zeros showed approximately uniform frequency spacing, particularly at high frequencies. A theoretical development indicated that this characteristic was to be expected from the known structure of the glottal excitation. The zero pattern was used to estimate the ratio of open-to-closed time for the glottis during voicing.

SPEECH is the acoustic result of certain events in the vocal apparatus. Indeed, many of the characteristics of these events can be deduced from the acoustical properties of the speech itself. For instance, it has long been realized that speech formants are a reflection of vocal tract resonances, and that the quasi-periodic nature of vocalized speech can be attributed directly to the vocal cord excitation. Usually, such features have been studied with the aid of the sound spectrograph and other special instruments. Though of great utility, these instruments have limitations, both in the range and flexibility of their analysis parameters, and in the dynamic range and resolution of their output displays.

With the advent of digital computers as analysis tools, much more sophisticated processing and display have become possible. This paper presents a basic speech spectrum analysis which takes advantage of the greater resources of the computer. The results indicate that this analysis leads to a much more precise acoustic representation of the speech wave than heretofore feasible. Consequently, this representation can be tied quite accurately to vocal tract events.

In the first step of the analysis, the speech time waveform is segmented into pitch periods which are then subjected to a Fourier expansion. The resulting spectrum is approximated by a number of resonances (poles) and antiresonances (zeros). Utilizing the assumption that each period is one of an infinite sequence of identical periods, this "pitch synchronous" representation can be related easily and precisely to vocal characteristics. In particular, the physical constraints imposed by the known structure of the vocal tract permit the poles and zeros to be assigned uniquely to either the vocal tract…
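To make the first analysis step concrete, here is a minimal Python sketch of a pitch-synchronous Fourier expansion. It assumes the pitch marks are supplied externally (the paper located them by visual inspection of oscillograms) and assumes a 10-kHz sampling rate; the pole-zero approximation by successive approximation is not reproduced.

```python
# Minimal sketch of pitch-synchronous spectrum analysis as outlined
# above: the waveform is cut at externally supplied pitch marks, and
# each period is treated as one cycle of an infinite periodic signal,
# so its Fourier series reduces to the DFT of that single period.
import numpy as np

FS = 10000.0  # assumed sampling rate in Hz (not specified here)

def pitch_synchronous_spectra(x: np.ndarray, pitch_marks: list[int]):
    """Return a (harmonic_freqs, magnitudes) pair for each pitch period.

    x           : speech samples
    pitch_marks : sample indices of period boundaries, assumed given
    """
    results = []
    for start, stop in zip(pitch_marks[:-1], pitch_marks[1:]):
        period = x[start:stop]
        n = len(period)
        coeffs = np.fft.rfft(period) / n          # Fourier coefficients of one period
        freqs = np.fft.rfftfreq(n, d=1.0 / FS)    # harmonics of the pitch frequency
        results.append((freqs, np.abs(coeffs)))
    return results

# Usage with a toy 100-Hz "voiced" waveform:
t = np.arange(int(FS)) / FS
x = np.sign(np.sin(2 * np.pi * 100 * t)) * np.exp(-5 * t)
marks = list(range(0, int(FS), 100))  # one mark per 10-ms period
spectra = pitch_synchronous_spectra(x, marks)
```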
Can you reliably identify a person by examining the spectrographic patterns of his speech sounds? This is a scientific problem of social consequence because of the interest of the courts in this question. The Technical Committee on Speech Communication of the Acoustical Society of America has asked some of its members to review the matter from a scientific point of view. The topics they considered included the nature of speech information as it relates to speaker identification, a comparison of voice patterns and fingerprint patterns, experimental evidence on voice identification, and requirements for validation of such identification methods. Findings and conclusions are reported; supporting information is given in appendixes.