The acoustic structure of the speech signal is extremely variable due to a variety of contextual factors, including talker characteristics and speaking rate. To account for the listener's ability to adjust to this variability, speech researchers have posited the existence of talker and rate normalization processes. The current study examined how the perceptual system encoded information about talker and speaking rate during phonetic perception. Experiments 1-3 examined this question using a speeded classification paradigm developed by Garner (1974). The results of these experiments indicated that decisions about phonemic identity were affected by both talker and rate information: irrelevant variation in either dimension interfered with phonemic classification. While rate classification was also affected by phoneme variation, talker classification was not. Experiment 4 examined the impact of talker and rate variation on the voicing boundary under different blocking conditions. The results indicated that talker characteristics influenced the voicing boundary when talker variation occurred within a block of trials only under certain conditions. Rate variation, however, influenced the voicing boundary regardless of whether or not there was rate variation within a block of trials. The findings from these experiments indicate that phoneme and rate information are encoded in an integral manner during speech perception, while talker characteristics are encoded separately.
Research over the past 30 years has revealed numerous aspects of the acoustic signal that play a role in phonetic perception. Theories of speech perception have attempted to explain how these acoustic characteristics are processed, integrated, and mapped onto the underlying phonetic representations. Such explanations have been hampered by the complex relationship between the characteristics of the signal and the underlying phonetic representations.
To investigate the extent and locus of integral processing in speech perception, a speeded classification task was utilized with a set of noise-tone analogs of the fricative-vowel syllables [fæ], [ʃæ], [fu], and [ʃu]. Unlike the stimuli used in previous studies of selective perception of syllables, these stimuli did not contain consonant-vowel transitions. Subjects were asked to classify on the basis of one of the two syllable components. Some subjects were told that the stimuli were computer generated noise-tone sequences. These subjects processed the noise and tone separably. Irrelevant variation of the noise did not affect reaction times (RTs) for the classification of the tone, and vice versa. Other subjects were instructed to treat the stimuli as speech. For these subjects, irrelevant variation of the fricative increased RTs for the classification of the vowel, and vice versa. A second experiment employed naturally spoken fricative-vowel syllables with the same task. Classification RTs showed a pattern of integrality in that irrelevant variation of either component increased RTs to the other. These results indicate that knowledge of coarticulation (or its acoustic consequences) is a basic element of speech perception. Furthermore, the use of this knowledge in phonetic coding is mandatory, even in situations where the stimuli do not contain coarticulatory information.
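The interference measure implied by this paradigm can be sketched as a difference in mean RT between a block where the irrelevant component varies and a block where it is held constant. This is a minimal illustration, not the analysis reported in the study: the function name `garner_interference`, the labels "control" and "orthogonal", and the RT values are all assumptions introduced here.

```python
from statistics import mean


def garner_interference(control_rts, orthogonal_rts):
    """Return mean RT (ms) in the orthogonal block minus mean RT in the control block.

    control_rts: RTs when the irrelevant component is held constant (assumed label).
    orthogonal_rts: RTs when the irrelevant component varies from trial to trial.
    A clearly positive value is the pattern described as integral processing;
    a value near zero is the pattern described as separable processing.
    """
    if not control_rts or not orthogonal_rts:
        raise ValueError("both conditions need at least one reaction time")
    return mean(orthogonal_rts) - mean(control_rts)


# Hypothetical RTs in milliseconds, for illustration only.
speech_instructions = garner_interference([500, 520], [560, 580])
nonspeech_instructions = garner_interference([500, 520], [505, 515])
```

In practice the difference would be evaluated statistically across subjects rather than read off a single pair of means.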
The classification efficacy of a spectral moments metric was tested on a corpus of voiceless fricatives. The metric classified phonetic identity on the basis of mean, variance, skewness, and kurtosis values derived from cross-sectional spectra. The classification power of both linear- and Bark-based versions of the metric was tested using a corpus of 420 voiceless fricatives (/f/, /θ/, /s/, /ʃ/, /h/) obtained from multiple talkers and vowel environments. Discriminant function analyses performed on linear and Bark moment profiles of well-identified fricative tokens resulted in overall classification accuracies of 78% and 74%, respectively. Classification accuracies were substantially higher when the nonsibilant (/f/ and /θ/) data were excluded. In an attempt to improve metric classification of nonsibilant tokens, an experiment designed to identify the perceptually appropriate location of a fricative analysis window was performed. The results suggested that nonsibilant classification performance of the metric may be substantially improved when moment information is based upon the spectral information contained within the onset and/or offset (transition) portion of the frication. [Work supported by NINCDS Grant NS19653 and NIDCD Grant DC00219 to SUNY at Buffalo.]
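One common way to compute the four moments named above is to treat a power spectrum as a probability distribution over frequency. The sketch below follows that approach under stated assumptions: the function name `spectral_moments` is hypothetical, the input is an already-computed spectrum on a linear frequency scale, and kurtosis is reported as excess kurtosis (fourth standardized moment minus 3). The abstract does not say which conventions the authors used, and the Bark-scaled variant and the discriminant analysis are not covered here.

```python
import math


def spectral_moments(freqs, power):
    """Return (mean, variance, skewness, excess kurtosis) of a spectrum.

    freqs: bin center frequencies (e.g. in Hz).
    power: non-negative power values, one per bin.
    The power values are normalized to sum to 1 and used as weights.
    """
    if len(freqs) != len(power):
        raise ValueError("freqs and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    weights = [value / total for value in power]

    mean = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - mean) ** 2 * w for f, w in zip(freqs, weights))
    if variance == 0:
        raise ValueError("skewness and kurtosis are undefined for zero variance")

    third = sum((f - mean) ** 3 * w for f, w in zip(freqs, weights))
    fourth = sum((f - mean) ** 4 * w for f, w in zip(freqs, weights))
    skewness = third / math.sqrt(variance) ** 3
    kurtosis = fourth / variance ** 2 - 3  # excess kurtosis (assumed convention)
    return mean, variance, skewness, kurtosis


# Toy three-bin spectrum, symmetric about 2000 Hz (illustrative values only).
m, v, s, k = spectral_moments([1000, 2000, 3000], [1, 2, 1])
```

For the symmetric toy spectrum the skewness is zero, as expected; a spectrum with its energy concentrated toward low frequencies and a tail toward high frequencies would give positive skewness.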