The current work examines native Korean speakers’ perception and production of stop contrasts in their native language (L1, Korean) and second language (L2, English), focusing on three acoustic dimensions that are all used, albeit to different extents, in both languages: voice onset time (VOT), f0 at vowel onset, and closure duration. Participants used all three cues to distinguish the L1 Korean three-way stop distinction in both production and perception. Speakers’ productions of the L2 English contrasts were reliably distinguished using both VOT and f0 (even though f0 is only a very weak cue to the English contrast), and, to a lesser extent, closure duration. In contrast to the relative homogeneity of the L2 productions, group patterns on a forced-choice perception task were less clear-cut, due to considerable individual differences in perceptual categorization strategies, with listeners using either primarily VOT duration, primarily f0, or both dimensions equally to distinguish the L2 English contrast. Differences in perception, which were stable across experimental sessions, were not predicted by individual variation in production patterns. This work suggests that reliance on multiple cues in representation of a phonetic contrast can form the basis for distinct individual cue-weighting strategies in phonetic categorization.
Listeners possess a remarkable ability to adapt to acoustic variability in the realization of speech sound categories (e.g. different accents). The current work tests whether non-native listeners adapt their use of acoustic cues in phonetic categorization when they are confronted with changes in the distribution of cues in the input, as native listeners do, and examines to what extent these adaptation patterns are influenced by individual cue-weighting strategies. In line with previous work, native English listeners, who use VOT as a primary cue to the stop voicing contrast (e.g. ‘pa’ vs. ‘ba’), adjusted their use of f0 (a secondary cue to the contrast) when confronted with a noncanonical “accent” in which the two cues gave conflicting information about category membership. Native Korean listeners’ adaptation strategies, while variable, were predictable based on their initial cue weighting strategies. In particular, listeners who used f0 as the primary cue to category membership adjusted their use of VOT (their secondary cue) in response to the noncanonical accent, mirroring the native pattern of “downweighting” a secondary cue. Results suggest that non-native listeners show native-like sensitivity to distributional information in the input and use this information to adjust categorization, just as native listeners do, with the specific trajectory of category adaptation governed by initial cue-weighting strategies.
Speech sound contrasts differ along multiple phonetic dimensions. During speech perception, listeners must decide which cues are relevant, and determine the relative importance of each cue, while also integrating other, signal-external cues. The comparison of cue weighting in perception and production bears on a range of theoretical issues including the processes underlying sound change, the time course of learning, the nature of cues, and the perception-production interface. Research examining the relative alignment of cue weighting across the modalities, on both a community and individual level, has revealed both parallels and asymmetries between the modalities. The extraordinarily wide range of ways that have been used to conceptualize and quantify cue weights reflects the inherent theoretical, methodological, and analytical differences between the two modalities. More consideration of the choices of analytical metrics, explicit discussion of the theoretical assumptions that underlie them, and systematic investigations of different types of cues will lead to more generalizable findings that can be incorporated into computationally implementable models of speech processing.
This article is categorized under: Linguistics > Language in Mind and Brain
Keywords: cue weighting, phonetics, speech perception, speech production
INTRODUCTION
Speech sound contrasts differ on multiple phonetic dimensions: for example, the English sounds /b/ and /p/ differ systematically in voice onset time (VOT, the amount of time between the stop release and onset of voicing), but also differ, albeit less reliably, in other dimensions including the duration of the stop closure and the fundamental frequency (f0, corresponding to perceived pitch) at the onset of voicing (Lisker, 1986). Phonetic cue weighting, or the relative use of these acoustic "cues," can be conceptualized and quantified in the context of both production and perception.
For example, as shown in Figure 1, English speakers' productions of /b/ and /p/ show large and consistent differences in VOT, such that /b/ vs. /p/ category membership can be well predicted using this cue alone. On the other hand, while /b/ is followed by a lower f0 than /p/ on average, this difference is much smaller and less consistent, such that f0 is only weakly predictive of category membership. This asymmetry between the two cues is reflected in perception: when asked to categorize sounds varying in aspiration duration and f0, listeners' responses are mainly determined by aspiration (the primary cue), with the value of f0 playing a secondary, albeit still detectable, role (Abramson & Lisker, 1985).
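One way to make the production side of this asymmetry concrete is to compare how far apart the two categories sit on each cue, in standardized units. The sketch below is illustrative only: the token values are invented toy numbers rather than measurements from any study, and the use of Cohen's d, along with the function names `cohens_d`, `cue_separation`, and `relative_weights`, is an assumption made for this example. It is one of the many possible metrics, not the method of the works summarized here.

```python
from statistics import mean, stdev


def cohens_d(a, b):
    """Standardized mean difference (b minus a) using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(b) - mean(a)) / pooled_var ** 0.5


# Hypothetical toy tokens (NOT real data): VOT in ms, f0 at vowel onset in Hz.
b_tokens = {"vot_ms": [5, 10, 12, 8, 15, 11], "f0_hz": [118, 125, 121, 130, 115, 124]}
p_tokens = {"vot_ms": [55, 62, 70, 58, 66, 60], "f0_hz": [122, 131, 119, 135, 127, 126]}


def cue_separation(cat_a, cat_b):
    """Absolute standardized category separation for each cue."""
    return {cue: abs(cohens_d(cat_a[cue], cat_b[cue])) for cue in cat_a}


def relative_weights(separation):
    """Normalize separations so that they sum to 1 across cues."""
    total = sum(separation.values())
    return {cue: value / total for cue, value in separation.items()}


separation = cue_separation(b_tokens, p_tokens)
weights = relative_weights(separation)

for cue in separation:
    print(f"{cue}: separation = {separation[cue]:.2f}, relative weight = {weights[cue]:.2f}")
```

With toy values like these, VOT separates the categories far more than f0 does, which mirrors the primary versus secondary cue pattern described above. A perception-side counterpart would typically estimate weights from listeners' categorization responses instead, for example with logistic regression coefficients.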
The purpose of this study was to identify characteristics of typical acquisition of the Mexican Spanish stop-spirant alternation in bilingual Spanish–English speaking children and to shed light on the theoretical debate over which sound is the underlying form in the stop-spirant allophonic relationship. We predicted that bilingual children would acquire knowledge of this allophonic relationship by the time they reach age 5;0 (years;months) and would demonstrate higher accuracy on the spirants, indicating their role as the underlying phoneme. This quasi-longitudinal study examined children’s single word samples in Spanish from ages 2;4–8;2. Samples were phonetically transcribed and analyzed for accuracy and substitution errors, and were analyzed acoustically for intensity ratios. Bilingual children demonstrated overall higher accuracy on the voiced stops as compared to the spirants. Differences in substitution errors across ages were found, and acoustic analyses corroborated perceptual findings. The clinical implication of this research is that bilingual children may be in danger of overdiagnosis of speech sound disorders because acquisition of this allophonic rule in bilinguals appears to differ from what has been found in previous studies examining monolingual Spanish speakers.