Several studies have demonstrated that individuals’ ability to perceive a speech sound contrast is related to the production of that contrast in their native language. The theoretical account for this relationship is that speech perception and production have a shared multimodal representation in relevant sensory spaces (e.g., auditory and somatosensory domains). This gives rise to a prediction that individuals with more narrowly defined targets will produce greater separation between contrasting sounds, as well as lower variability in the production of each sound. However, empirical studies that tested this hypothesis, particularly with regard to variability, have reported mixed outcomes. The current study investigates the relationship between perceptual ability and production ability, focusing on the auditory domain. We examined whether individuals’ categorical labeling consistency for the American English /ε/–/æ/ contrast, measured using a perceptual identification task, is related to the distance between the centroids of the vowel categories in acoustic space (i.e., vowel contrast distance) and to two measures of production variability: the overall distribution of repeated tokens for the vowels (i.e., area of the ellipse) and the proportional within-trial decrease in variability, defined as the magnitude of self-correction relative to the initial acoustic variation of each token (i.e., centering ratio). No significant associations were found between categorical labeling consistency and vowel contrast distance, between categorical labeling consistency and area of the ellipse, or between categorical labeling consistency and centering ratio. These null results suggest that the perception-production relation may not be as robust as suggested by a widely adopted theoretical framing in terms of the size of auditory target regions.
However, the present results may also be attributable to choices in implementation (e.g., the use of model talkers instead of continua derived from the participants’ own productions) that should be subject to further investigation.
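As a rough illustration of two of the production measures named in this abstract, the sketch below computes a vowel contrast distance (Euclidean distance between category centroids) and the area of a covariance ellipse for repeated tokens. It is a minimal sketch under stated assumptions, not the study's analysis: the function names, the use of (F1, F2) pairs in Hz, the 95% coverage level, and the sample values are all hypothetical, and the abstract does not specify the acoustic space or the ellipse definition. Centering ratio is left out because the abstract does not give enough detail to reconstruct it.

```python
import math
from statistics import mean


def centroid(tokens):
    """Mean (F1, F2) of a list of (F1, F2) tokens."""
    return (mean(t[0] for t in tokens), mean(t[1] for t in tokens))


def vowel_contrast_distance(tokens_a, tokens_b):
    """Euclidean distance between the centroids of two vowel categories."""
    (a1, a2), (b1, b2) = centroid(tokens_a), centroid(tokens_b)
    return math.hypot(a1 - b1, a2 - b2)


def ellipse_area(tokens, coverage=0.95):
    """Area of the covariance ellipse for one vowel category.

    Assumption: tokens are treated as bivariate normal, and the ellipse
    is scaled to contain `coverage` of that distribution.
    """
    n = len(tokens)
    m1, m2 = centroid(tokens)
    # Sample (n - 1) variances and covariance of the two dimensions.
    s11 = sum((t[0] - m1) ** 2 for t in tokens) / (n - 1)
    s22 = sum((t[1] - m2) ** 2 for t in tokens) / (n - 1)
    s12 = sum((t[0] - m1) * (t[1] - m2) for t in tokens) / (n - 1)
    det = s11 * s22 - s12 ** 2
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    scale = -2.0 * math.log(1.0 - coverage)
    return math.pi * scale * math.sqrt(max(det, 0.0))


# Hypothetical (F1, F2) tokens in Hz, invented for illustration only.
eh_tokens = [(580, 1800), (600, 1780), (590, 1820), (610, 1790)]
ae_tokens = [(690, 1700), (710, 1680), (700, 1720), (720, 1690)]

print(vowel_contrast_distance(eh_tokens, ae_tokens))
print(ellipse_area(eh_tokens), ellipse_area(ae_tokens))
```

Under these assumptions, a speaker with narrower targets would be expected to show a larger contrast distance and smaller ellipse areas, which is the prediction the study tested.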
Purpose: Previous studies have demonstrated that speakers can learn novel speech sequences, although the content and specificity of the learned speech motor representations remain incompletely understood. We investigated these representations by examining transfer of learning in the context of nonnative consonant clusters. Specifically, we investigated whether American English speakers who learn to produce either voiced or voiceless stop–stop clusters (e.g., /gd/ or /kt/) exhibit transfer to the other voicing pattern. Method: Each participant (n = 34) was trained on disyllabic nonwords beginning with either voiced (/gd/, /db/, /gb/) or voiceless (/kt/, /kp/, /tp/) onset consonant clusters (e.g., /gdimu/, /ktaksnæm/) in a practice-based speech motor learning paradigm. All participants were tested on both voiced and voiceless clusters at baseline (prior to practice) and in two retention sessions (20 min and 2 days after practice). We compared changes in cluster accuracy and burst-to-burst duration between baseline and each retention session to evaluate learning (performance on the trained clusters) and transfer (performance on the untrained clusters). Results: Participants in both training conditions improved with respect to cluster accuracy and burst-to-burst duration for the clusters they practiced on. A bidirectional transfer pattern was found, such that participants also improved in cluster accuracy and burst-to-burst duration for the clusters with the other, untrained voicing pattern. Post hoc analyses also revealed improvement in the production of untrained stop–fricative clusters that were originally added as filler items. Conclusion: Our findings suggest that the learned speech motor representations may encode information about the coordination of oral articulators for stop–stop clusters independently from information about the coordination of oral and laryngeal articulators.
Purpose: Nonnative consonant cluster learning has become a useful experimental approach for studying speech motor learning, and we sought to enhance our understanding of this area and to establish best practices for this type of research. Method: One hundred twenty individuals completed a nonnative consonant cluster learning task within a speech motor learning paradigm. Following a brief prepractice, participants practiced the production of eight word-initial nonnative consonant clusters embedded in bisyllabic nonwords (e.g., GD in /gdivu/). The clusters ranged in difficulty according to linguistic typology and sonority sequencing. Acquisition was operationalized as the change across the practice section, and learning was assessed with two retention sessions (R1: 30 min after practice; R2: 2 days after practice). We evaluated changes in accuracy as well as in the acoustic details of the cluster production at each time point. Results: Overall, participants improved in their production of the consonant clusters. Accuracy increased, and specific duration measures associated with cluster production decreased. Coordination, as measured in the acoustics, changed both for clusters that were incorrectly produced and for those that were correctly produced, indicating continued motor learning even in accurate tokens. Conclusions: These results aid our understanding of the complexity of nonnative consonant cluster learning. In particular, factors related to both phonological and speech motor control properties affect the learning of novel speech sequences. Supplemental Material: https://doi.org/10.23641/asha.21844185
Cross-language studies of speech production have shown, based on acoustic analyses, that English speakers can produce phonotactically illegal onset fricative-nasal clusters (e.g., /fn/) with high accuracy. However, it remains unclear whether the articulatory gestures affiliated with the fricative-nasal segments are produced with gestural timing comparable to that of native onset clusters (e.g., /sm/, /fl/). Here we used electromagnetic articulography (EMA) to investigate whether the production of non-native /fn/ onset clusters exhibits a comparable amount of consonant-vowel gestural overlap to native onset clusters (i.e., /sm/ and /fl/) in nonwords. Previous articulatory investigations have demonstrated that native English onset clusters exhibit an increase in gestural overlap between the vowel-adjacent consonant and the vowel (e.g., SMAGDEEP) when compared to the corresponding singleton (MAGDEEP). We controlled for the vocal tract configuration (e.g., jaw position) by comparing each onset cluster to heterosyllabic sequences consisting of the same consonant sequences (e.g., /sm/ vs. /s#m/). While data collection is still ongoing, the preliminary results (n = 3) suggest that when the complexity of the onset increased, vowel-adjacent consonantal gestures showed greater temporal overlap with vocalic gestures for phonotactically legal sequences (SMAGDEEP versus MAGDEEP) but not for illegal sequences (FNAGDEEP versus NAGDEEP).
Previous studies have demonstrated that American English speakers can improve their production of phonotactically illegal onset clusters (e.g., DBEEGO) after structured practice. However, the nature of what is learned remains incompletely understood. We use a transfer paradigm to address this question by examining performance on trained and untrained novel consonant sequences. In particular, we investigated whether the differences between voiced and voiceless stop-stop clusters (e.g., /gd/ vs. /kt/) influence transfer of learning, hypothesizing that the voiced clusters involve more complex motor control. Forty native speakers of American English practiced nonwords beginning with either voiced (/gd/, /db/, /gb/) or voiceless (/kt/, /kp/, /tp/) stop-stop onset clusters. All participants were tested on both types of clusters at baseline (prior to practice) and in two retention sessions (R1: 20 minutes after practice; R2: 2 days after practice). Blinded coders rated cluster accuracy based on the presence of a vowel in the acoustics. Preliminary results (n = 10) indicate a trend of bi-directional transfer, with participants in both practice conditions exhibiting improved accuracy for both trained and untrained clusters.