2008
DOI: 10.1109/icassp.2008.4518639
Gaze-contingent ASR for spontaneous, conversational speech: An evaluation

Abstract: There has been little work that attempts to improve the recognition of spontaneous, conversational speech by adding information from a loosely-coupled modality. This study investigated this idea by integrating information from gaze into an ASR system. A probabilistic framework for multimodal recognition was formalised and applied to the specific case of integrating gaze and speech. Gaze-contingent ASR systems were developed from a baseline ASR system by redistributing language model probability mass according …
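The abstract describes redistributing language model probability mass according to gaze. As a rough illustration of what such a redistribution could look like, here is a minimal sketch over a unigram distribution: a fixed fraction of the total mass is shifted toward words associated with the current gaze target, then the distribution is renormalised. The function name, the unigram setting, and the single `gaze_weight` parameter are illustrative assumptions, not the paper's actual formulation.

```python
def gaze_rescore(lm_probs, gazed_words, gaze_weight=0.3):
    """Shift a fraction `gaze_weight` of language-model probability
    mass toward words linked to the user's current gaze target.

    lm_probs    -- dict mapping word -> probability (sums to 1)
    gazed_words -- iterable of words associated with the gaze target
    Returns a new dict that still sums to 1.
    """
    gazed = [w for w in gazed_words if w in lm_probs]
    if not gazed:
        # No overlap with the vocabulary: leave the LM unchanged.
        return dict(lm_probs)
    # Scale every word down uniformly, freeing `gaze_weight` of mass...
    boosted = {w: p * (1.0 - gaze_weight) for w, p in lm_probs.items()}
    # ...then share the freed mass equally among the gazed-at words.
    extra = gaze_weight / len(gazed)
    for w in gazed:
        boosted[w] += extra
    return boosted
```

Because the scaling is uniform and the freed mass is fully returned, the result remains a valid probability distribution while gazed-at words gain relative probability.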

Cited by 5 publications (2 citation statements)
References 3 publications
“…Eye-tracking can be used to re-assign the probabilities of speech recognition hypotheses or to adapt the language model by considering the human's visual attention, leading to a significant decrease in word error rate [64]. However, the improved recognition accuracy achieved with this technique was linked more to the visual field than to the visual focus [65]. Eye-tracking and other non-verbal modalities have also been combined to make speech recognition more robust against noise [66].…”
Section: Multimodal Integration Of Different Modalities Related To Human-Machine Interaction
confidence: 99%
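The citation statement above mentions re-assigning the probabilities of speech recognition hypotheses using visual attention. One simple way this idea is often realised is N-best rescoring: interpolate each hypothesis's ASR score with a gaze-agreement score. The sketch below is a hypothetical illustration of that scheme; the word-overlap gaze score and the `alpha` interpolation weight are assumptions, not the cited systems' actual models.

```python
def rerank_hypotheses(hypotheses, gazed_words, alpha=0.5):
    """Re-rank ASR N-best hypotheses with a gaze-agreement score.

    hypotheses  -- list of (text, asr_score) pairs, scores in [0, 1]
    gazed_words -- words associated with the current gaze target
    alpha       -- interpolation weight for the gaze score
    Returns the list sorted by the combined score, best first.
    """
    gazed = {w.lower() for w in gazed_words}
    reranked = []
    for text, asr_score in hypotheses:
        words = text.lower().split()
        # Gaze score: fraction of hypothesis words matching gaze targets.
        gaze_score = sum(w in gazed for w in words) / len(words) if words else 0.0
        combined = (1.0 - alpha) * asr_score + alpha * gaze_score
        reranked.append((text, combined))
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)
```

With `alpha = 0`, the ranking reduces to the ASR scores alone; increasing `alpha` lets visual attention override weak acoustic evidence.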
“…Eye gaze has been explored in automated language understanding such as speech recognition [4,14], reference resolution [3,13], and recently for word acquisition [10,22]. Given speech paired with eye gaze information and video images, a translation model was used to acquire words by associating acoustic phone sequences with visual representations of objects and actions [22].…”
Section: Related Work
confidence: 99%