In the experiments reported here, we attempted to find out more about how the auditory system is able to separate two simultaneous harmonic sounds. Previous research (Halikia & Bregman, 1984a, 1984b; Scheffers, 1983a) had indicated that a difference in fundamental frequency (F0) between two simultaneous vowel sounds improves their separate identification. In the present experiments, we looked at the effect of F0s that changed as a function of time. In Experiment 1, pairs of unfiltered or filtered pulse trains were used. Some were steady-state, and others had gliding F0s; different F0 separations were also used. The subjects had to indicate whether they had heard one or two sounds. The results showed that increased F0 differences and gliding F0s facilitated the perceptual separation of simultaneous sounds. In Experiments 2 and 3, simultaneous synthesized vowels were used on frequency contours that were steady-state, gliding in parallel (parallel glides), or gliding in opposite directions (crossing glides). The results showed that crossing glides led to significantly better vowel identification than did steady-state F0s. Also, in certain cases, crossing glides were more effective than parallel glides. The superior effect of the crossing glides could be due to the common frequency modulation of the harmonics within each component of the vowel pair and the consequent decorrelation of the harmonics between the two simultaneous vowels.
In most natural listening situations, at any given moment, the vibrations of our eardrums are the result of several sound sources active at the same time. In such cases, the auditory system is faced with the problem of separating the pattern of superimposed sounds into individual subsets of components that correspond to the separate sound sources.
Otherwise, nonveridical percepts will be formed, in each of which some of the properties of the perceived sound will derive from one acoustic source, while others will derive from other sources.
The present experiments were designed to examine the effects of two types of grouping cues on the perceptual fusing of certain parts of a spectrum with one another. One of these was the "F0" cue: The spectrum contains partials that can be grouped into two subsets by virtue of the fact that each subset contains the harmonics of a different fundamental (F0). The second was the "common fate" cue: When a set of harmonics of the same F0
This article is based on research that formed part of a PhD dissertation, submitted to McGill University by M. H. Halikia (now spelled Chalikia) in 1985. The research was supported by the Natural Sciences and Engineering Research Council of Canada, through a grant awarded to A. S. Bregman. We wish to thank S. McAdams, V. Ciocca, B. Roberts, and two anonymous reviewers for their helpful comments on an earlier version of the manuscript. M. H. Chalikia is now in the Department of Psychology, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201. Requests for reprints should be sent to Albert S. Bregman,
This experiment was an investigation of the ability of listeners to identify the constituents of double vowels (pairs of synthetic vowels, presented concurrently and binaurally). Three variables were manipulated: (1) the size of the difference in F0 between the constituents (0, 1/2, and 6 semitones); (2) the frequency relations among the sinusoids making up the constituents: harmonic, shifted (spaced equally in frequency but not integer multiples of the F0), and random; and (3) the relationship between the F0 contours imposed on the constituents: steady state, gliding in parallel, or gliding in opposite directions. It was assumed that, in the case of the gliding contours, the harmonics of each vowel would "trace out" their spectral envelope and potentially improve the definition of the formant locations. It was also assumed that the application of different F0 contours would introduce differences in the direction of harmonic movement (common fate), thus aiding the perceptual segregation of the two vowels. The major findings were the following: (1) For harmonic constituents, a difference in F0 leads to improved identification performance. Neither tracing nor common-fate differences add to the effect of pitch differences. (2) For shifted constituents, a difference between the spacing of the constituents also leads to improved performance. Formant tracing and common fate contribute some further improvement. (3) For random constituents, tracing does not contribute, but common fate does.
In most listening situations, we rarely hear a single sound in complete isolation. Several sound sources are often active at the same time, producing a complex pattern of vibrations on our eardrums. The auditory system is, therefore, faced with the problem of distinguishing the different sets of components that correspond to separate sound sources.
Otherwise, it would not be possible to understand, for example, what one speaker is saying in the presence of competing speakers or background noises.
Different experiments have studied the ability to selectively attend to one speech signal in a mixture of continuous speech signals (Broadbent, 1952; Brokx & Nooteboom, 1982; Cherry, 1953; Darwin, 1981). These studies have suggested that perceptual separation can improve when the signals have different pitches. Scheffers (1983) investigated the effects of fundamental frequency (F0) differences on the identification of two simultaneous steady-state synthetic vowels and confirmed the hypothesis that the two vowels can be more easily separated when the F0s differ by more than 1-2 semitones. Listeners' ability to identify both vowels improved by about 18% with dif-
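As a rough illustration of the stimulus variables named in the abstract above (F0 differences expressed in semitones, and harmonic, shifted, or random component frequencies), the following is a minimal sketch. The function names, the shift fraction, the number of components, and the uniform distribution used for the random condition are all assumptions made for illustration; the source does not state the values actually used in the experiments.

```python
import random

def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for an interval in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def harmonic_components(f0: float, n: int) -> list[float]:
    """Integer multiples of f0: f0, 2*f0, ..., n*f0."""
    return [f0 * k for k in range(1, n + 1)]

def shifted_components(f0: float, n: int, shift_fraction: float = 0.25) -> list[float]:
    """Components spaced f0 apart but offset so they are not integer
    multiples of f0. shift_fraction is a hypothetical value."""
    offset = shift_fraction * f0
    return [f0 * k + offset for k in range(1, n + 1)]

def random_components(f0: float, n: int, seed: int = 0) -> list[float]:
    """n frequencies drawn uniformly between f0 and n*f0 (an assumed
    distribution; the source does not specify one)."""
    rng = random.Random(seed)
    return sorted(rng.uniform(f0, f0 * n) for _ in range(n))

if __name__ == "__main__":
    f0_a = 100.0                         # hypothetical F0 of the first vowel, in Hz
    f0_b = f0_a * semitone_ratio(6)      # second vowel, 6 semitones higher
    print(round(f0_b, 2))                # 141.42
    print(harmonic_components(f0_a, 3))  # [100.0, 200.0, 300.0]
    print(shifted_components(f0_a, 3))   # [125.0, 225.0, 325.0]
```

With a 6-semitone difference the two F0s stand in a ratio of the square root of 2, so the two harmonic series share almost no component frequencies, whereas at 0 semitones they coincide completely.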
Listeners presented with a repeated sequence of brief (30- to 100-msec) steady-state vowels hear phonemic transformations: they cannot identify the vowels, but they perceive two simultaneous utterances that differ in both phonemic content and timbre (Warren, Bashford, & Gardner, 1990). These utterances consist of either English words or syllables that occur in English words. In the present study, we attempted to determine whether the two percepts represent alternative interpretations of the same formant structures, or whether different portions of the vowels are used for each verbal organization. It was found that separate spectral regions are employed for each verbal form: Components below 1500 Hz were generally used for one form, and components above 1500 Hz for the other. Hypotheses are offered concerning the processes responsible for the verbal organization of the vowel sequences and for the splitting into two spectrally limited forms. It appears that the tendency to organize spectral regions separately competes with, and can overcome, the tendency to integrate the different spectral components of speech into a single auditory image. A contralateral induction paradigm was used in a procedure designed to quantitatively evaluate these opposing forces of spectral fission and fusion.
Previous research has shown that listeners presented with repeated sequences of brief steady-state vowels (30–100 msec) experience phonemic transformations (that is, illusory changes in the identity of the constituent speech sounds) and report hearing verbal forms consisting of one or more syllables rather than a succession of vowels. Often the signal is split perceptually into two simultaneous organizations differing in both timbre and phonemic content. The present studies employed sequences of eight 80-msec vowels and mapped the perceptual phonemes to acoustic phones by terminating the repeated sequence at predetermined positions and determining the last speech sound heard in the perceived verbal organization (Experiments 1 and 2). When two simultaneous organizations were heard, they were both mapped (Experiments 2 and 3). It was found that: (1) all listeners reported hearing a polysyllabic verbal organization together with either a noise-like non-linguistic residue or a second verbal organization; (2) the verbal forms heard not only followed the phonotactic rules of English, but also corresponded to syllables actually found in English; (3) when two simultaneous organizations were heard, the primary or more salient one was usually longer; (4) simultaneous organizations had different timbres. Implications concerning the perceptual organization of speech are discussed.
When listeners are presented with pairs of octave-complex tones related by a tritone interval (a half-octave), they hear the pattern as ascending or descending, according to an individual pitch class template. Deutsch (1991) has claimed that this template may be influenced by language. In order to test this hypothesis, data from Greek bilingual listeners were collected and compared with data from Texas, California, and the south of England. The results show significant differences in how Greek listeners hear the tritone stimuli, as compared to listeners in the other groups. There is also evidence that the Greek listeners may have developed two different pitch class templates, possibly representing the influence of English and the influence of Greek.
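The tritone stimuli described above can be sketched numerically. This is a minimal illustration, assuming equal temperament, an A = 440 Hz reference, and six octave-spaced components per tone; the function names and these parameter values are hypothetical, and the spectral envelope applied to real octave-complex tones is not modeled.

```python
import math

def octave_complex(base_hz: float, n_octaves: int = 6) -> list[float]:
    """Components of an octave-complex tone: the base frequency repeated
    at successive octaves (the amplitude envelope is omitted here)."""
    return [base_hz * 2 ** k for k in range(n_octaves)]

def tritone_above(freq_hz: float) -> float:
    """Frequency a tritone (6 semitones, half an octave) above freq_hz."""
    return freq_hz * 2 ** (6 / 12)

def pitch_class(freq_hz: float, ref_hz: float = 440.0) -> int:
    """Pitch class as semitones above the reference, modulo 12."""
    return round(12 * math.log2(freq_hz / ref_hz)) % 12

if __name__ == "__main__":
    first = octave_complex(55.0)
    second = octave_complex(tritone_above(55.0))
    print({pitch_class(f) for f in first})   # {0}
    print({pitch_class(f) for f in second})  # {6}
```

Because every component of a tone shares one pitch class and the two pitch classes lie 6 semitones apart in either direction around the pitch class circle, the stimulus itself does not determine whether the pair ascends or descends, which is why the listener's own template can decide the percept.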