2009
DOI: 10.1121/1.3250425
Speech identification in noise: Contribution of temporal, spectral, and visual speech cues

Abstract: This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual…

Cited by 8 publications (8 citation statements)
References 48 publications
“…An explanation for the finding that PC1 was the motion component that showed a significant correlation with the percent correct in noise AV scores is that the PC1 movements provided information about the auditory speech signal that could be used to parse this signal from the competing background noise (see Davis and Kim 2004;Kim et al 2009). To determine whether this was the case, a correlation analysis was conducted to examine the degree to which PC1 and the wideband intensity speech envelope (see figure 5) was correlated for in quiet and in noise productions.…”
Section: Results
confidence: 99%
“…The perceptual doping effect in the AV1–A2 modality order may have been so strong that it greatly helped the listeners to decode the temporal cues necessary to discriminate vowel duration in the vowel duration discrimination task and to extract phonological cues for vowel identification in the gated vowel identification task, subsequently boosting the participants’ performance on these tasks in the A modality. The addition of V cues might have a stronger effect for consonants than vowels in terms of their AV identification (Kim et al 2009; Moradi et al 2017a), which could explain why the abovementioned effect for vowels was not observed in the gated consonant task. For instance, Moradi et al (2017a) reported that the effect of adding V cues on AV identification and cognitive demand reduction was more evident for consonants than for vowels (i.e., more V saliency for the AV identification of consonants than vowels).…”
Section: Discussion
confidence: 99%
“…Everyday speech comprehension is multi-faceted: in face-to-face conversation, the listener receives information from the voice and face of a talker, the accompanying gestures of the hands and body and the overall semantic context of the discussion, which can all be used to aid comprehension of the spoken message. Behaviourally, auditory speech comprehension is enhanced by simultaneous presentation of a face or face-like visual cues (Sumby and Pollack, 1954; Grant & Seitz, 2000b; Girin et al, 2001; Kim & Davis, 2004; Bernstein et al, 2004; Schwartz et al, 2004; Helfer & Freyman, 2005; Ross et al, 2007; Thomas & Pilling, 2007; Bishop & Miller, 2009; Kim et al, 2009; Ma et al, 2009; Hazan et al, 2010). Higher-order linguistic information can also benefit intelligibility: words presented in a sentence providing a rich semantic context are more intelligible than words in isolation or in an abstract sentence, particularly when auditory clarity is compromised (Miller & Isard, 1963; Kalikow et al, 1977; Pichora-Fuller et al, 1995; Dubno et al, 2000; Grant & Seitz, 2000a; Stickney & Assmann, 2001, Obleser et al, 2007).…”
Section: Introduction
confidence: 99%