Audiovisual Mandarin Lexical Tone Perception in Quiet and Noisy Contexts: The Influence of Visual Cues and Speech Rate

Li, Man-hong; Chen, Xiaoxiang; Zhu, Jiaqiang; Chen, Fei

doi:10.1044/2022_jslhr-22-00024

Cited by 4 publications

(4 citation statements)

References 69 publications

“…Particularly, for Tone 3, the larger and faster head-raising movement after head dipping in clear than plain speech is aligned with the dynamic nature of Tone 3 (with a falling-rising pitch contour), thus enhancing the Tone 3-specific characteristics. This is consistent with the previous findings of a greater accuracy in visual perception of Tone 3 compared to that of the other tones in Mandarin (Hannah et al, 2017; Li et al, 2022). In contrast, in Tone 4 clear speech, a larger and faster head raise occurred after head falling (that is, after the completion of Tone 4), which consequently approximated a Tone 3 movement trajectory and caused confusion.…”

Section: Discussion (supporting)

confidence: 93%

“…Further research has shown that, similar to the observation for segmental distinctions (Sumby and Pollack, 1954), visual tonal information may become more prominent in situations where auditory information is degraded and more difficult to access, such as in the presence of background noise or with a hearing impairment. For example, for Mandarin and Thai tones, while there was no difference in native perceivers' identification in the AV and the AO modes, an advantage for the AV mode over the AO mode became apparent when the same stimuli were presented in babble or cafeteria noise (Mixdorff et al, 2005; Burnham et al, 2015; Hannah et al, 2017; Li et al, 2022). Similarly, when the acoustic signal of Mandarin tones was degraded to resemble cochlear-implant (CI) speech, Mandarin perceivers did significantly better in the CI-simulated AV condition than in the CI-simulated AO condition (Smith and Burnham, 2012).…”

Section: Tone Perception and Visual Cues (mentioning)

confidence: 99%

“…The existence of phoneme-specific visual information in tone is supported by the finding that not all tones benefit equally from the presence of the speaker's face. For example, visual gain in the AV perception of both Cantonese (Burnham et al, 2001a) and Mandarin (Mixdorff et al, 2005; Hannah et al, 2017; Li et al, 2022) tones has been found for the more dynamic contour tones (e.g., the dipping tone and the falling tone in Mandarin). Smith and Burnham (2012) found that in CI speech where pitch information is not available, pairings involving the dipping tone were better discriminated and this advantage was more pronounced in the AV condition.…”

Section: Tone Perception and Visual Cues (mentioning)

confidence: 99%

Multi-modal cross-linguistic perception of Mandarin tones in clear speech

Zeng, Leung, Jongman et al. 2023

Front. Hum. Neurosci.

Clearly enunciated speech (relative to conversational, plain speech) involves articulatory and acoustic modifications that enhance auditory–visual (AV) segmental intelligibility. However, little research has explored clear-speech effects on the perception of suprasegmental properties such as lexical tone, particularly involving visual (facial) perception. Since tone production does not primarily rely on vocal tract configurations, tones may be less visually distinctive. Questions thus arise as to whether clear speech can enhance visual tone intelligibility, and if so, whether any intelligibility gain can be attributable to tone-specific category-enhancing (code-based) clear-speech cues or tone-general saliency-enhancing (signal-based) cues. The present study addresses these questions by examining the identification of clear and plain Mandarin tones with visual-only, auditory-only, and AV input modalities by native (Mandarin) and nonnative (English) perceivers. Results show that code-based visual and acoustic clear tone modifications, although limited, affect both native and nonnative intelligibility, with category-enhancing cues increasing intelligibility and category-blurring cues decreasing intelligibility. In contrast, signal-based cues, which are extensively available, do not benefit native intelligibility, although they contribute to nonnative intelligibility gain. These findings demonstrate that linguistically relevant visual tonal cues are existent. In clear speech, such tone category-enhancing cues are incorporated with saliency-enhancing cues across AV modalities for intelligibility improvements.

“…Future studies are urgently called to clarify the relationship between the McGurk illusion and the other measurements of audiovisual integration. Besides, as working memory seemed not to be an important factor in predicting audiovisual speech perception among adults (Li et al, 2022), this factor was not addressed by the current study. Considering this might not be the case in children, future studies with working memory taken into account are also warranted.…”

Section: Limitations and Future Directions (mentioning)

confidence: 66%

The development of audiovisual speech perception in Mandarin‐speaking children: Evidence from the McGurk paradigm

Weng, Rong, Peng 2023

Child Development

The developmental trajectory of audiovisual speech perception in Mandarin‐speaking children remains understudied. This cross‐sectional study in Mandarin‐speaking 3‐ to 4‐year‐old, 5‐ to 6‐year‐old, 7‐ to 8‐year‐old children, and adults from Xiamen, China (n = 87, 44 males) investigated this issue using the McGurk paradigm with three levels of auditory noise. For the identification of congruent stimuli, 3‐ to 4‐year‐olds underperformed older groups whose performances were comparable. For the perception of the incongruent stimuli, a developmental shift was observed as 3‐ to 4‐year‐olds made significantly more audio‐dominant but fewer audiovisual‐integrated responses to incongruent stimuli than older groups. With increasing auditory noise, the difference between children and adults widened in identifying congruent stimuli but narrowed in perceiving incongruent ones. The findings regarding noise effects agree with the statistically optimal hypothesis.
