When a bilingual switches languages, do they switch their "voice"? Using a new conversational corpus of speech from early Cantonese-English bilinguals (N = 34), this paper examines the talker-specific acoustic signature of bilingual voices. Following prior work on voice quality variation, 24 filter- and source-based acoustic measures are estimated. The analysis summarizes mean differences along these dimensions and identifies the underlying structure of each talker's voice across languages with principal component analyses. Canonical redundancy analyses demonstrate that while talkers vary in the degree to which they have the same "voice" across languages, all talkers remain strongly similar to themselves.
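A minimal sketch of the kind of per-talker dimensionality reduction the abstract describes, assuming a token-by-measure matrix of 24 acoustic measures; the data here are random placeholders, not the corpus, and the analysis shown is ordinary PCA via SVD rather than the paper's full canonical redundancy procedure:

```python
# Sketch: per-talker PCA over a hypothetical matrix of acoustic voice
# measures (rows = speech tokens, columns = 24 filter- and source-based
# measures). Data are synthetic placeholders, not the actual corpus.
import numpy as np
from numpy.linalg import svd

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 24))  # 200 tokens x 24 acoustic measures

# Standardize each measure, then PCA via singular value decomposition.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = svd(Xz, full_matrices=False)

# Proportion of variance explained by each principal component;
# the loadings in Vt give the "structure" of the talker's voice.
var_explained = s**2 / np.sum(s**2)
```

In the paper's design, loadings from each language would then be compared within a talker; here only the single-matrix decomposition is shown.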
Manipulating speaking and discourse requirements allows us to assess the time-varying correspondences among various subsystems within a talker at different levels of vocal effort. These subsystems include fundamental frequency (F0) and acoustic amplitude, rigid-body (6D) motion of the head, motion (2D) of the body, and postural forces and torques measured at the feet. Analysis of six speakers has confirmed our hypothesis that, as vocal effort increases, coordination among subsystems simplifies, as shown by greater correspondence (e.g., the instantaneous correlation) between the various time-series measures. However, at the two highest levels of vocal effort, elicited by having talkers shout to and yell at someone located appropriately far away, elements of the postural force, notably one or more torque components, often show reduced correspondence with the other measures. We interpret this result as evidence that talkers become more rigidly coordinated at the highest levels of vocal effort, which can interfere with their balance. Furthermore, the discourse type (shouting at someone to carry on a conversation vs. yelling at someone not expected to reply) can be associated with differing amounts of imbalance.
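One common way to compute the kind of instantaneous correlation the abstract mentions is a sliding-window Pearson correlation between two time series; a minimal sketch with synthetic signals standing in for, say, F0 and head motion (the window length and signals are illustrative assumptions, not the study's parameters):

```python
# Sketch: windowed ("instantaneous") correlation between two time
# series, e.g. F0 and a head-motion component. Signals are synthetic.
import numpy as np

def windowed_corr(x, y, win):
    """Pearson correlation of x and y within a sliding window of length win."""
    out = np.empty(len(x) - win + 1)
    for i in range(len(out)):
        out[i] = np.corrcoef(x[i:i + win], y[i:i + win])[0, 1]
    return out

t = np.linspace(0, 10, 1000)
f0 = np.sin(2 * np.pi * 0.5 * t)            # toy F0 contour
head = np.sin(2 * np.pi * 0.5 * t + 0.3)    # toy head motion, slightly lagged
r = windowed_corr(f0, head, win=100)        # correlation trajectory over time
```

A rise in such a correlation trajectory with vocal effort would correspond to the "greater correspondence" reported; a drop for a torque channel would correspond to the reduced correspondence at the highest effort levels.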
Work in audiovisual speech processing (AVSP) has established that the availability of visual speech signals can influence auditory perception by improving the intelligibility of speech in noise (Sumby and Pollack, 1954). However, exactly which aspects of visible signals are most responsible for this enhancement remains an open question, although convergent evidence along several lines suggests that visible information may reflect a common articulatory-acoustic temporal signature, and that the multi-modal availability of this temporal signature is at the root of this effect. We evaluated this hypothesis in a perceptual study using simple talking face animations whose motion is driven by a signal derived from the collective motion of perioral structures of an actual talker. We applied spatial and temporal manipulations to the structure of this driving signal using a biologically plausible model that preserves the smoothness of the manipulated trajectory, and tested whether these kinematic manipulations influenced the perception of linguistic prominence, an important component of the timing and rhythm (prosody) of speech. The data suggest that perceivers are sensitive to these manipulations, and that the cross-correlation between the acoustic amplitude envelope and the manipulated visible signal was a strong predictor of the perception of prominence.
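The predictor described, cross-correlation between the acoustic amplitude envelope and a visible-motion signal, can be sketched as a normalized cross-correlation whose peak value serves as the predictor; the signals below are toy stand-ins, not the study's stimuli:

```python
# Sketch: peak normalized cross-correlation between an acoustic amplitude
# envelope and a visible kinematic signal. Signals here are synthetic.
import numpy as np

def peak_xcorr(a, b):
    """Peak of the normalized cross-correlation of a and b, and its lag."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    xc = np.correlate(a, b, mode="full")
    lag = int(np.argmax(xc)) - (len(b) - 1)
    return float(xc.max()), lag

t = np.linspace(0, 5, 500)
env = np.abs(np.sin(2 * np.pi * 1.0 * t))   # toy amplitude envelope
motion = np.roll(env, 10)                   # toy motion, delayed copy
peak, lag = peak_xcorr(env, motion)
```

In a perceptual analysis, a peak value like this (computed per stimulus) would enter a regression predicting prominence judgments.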