2019
DOI: 10.1145/3323231
Modeling of Human Visual Attention in Multiparty Open-World Dialogues

Abstract: This study proposes, develops, and evaluates methods for modeling the eye-gaze direction and head orientation of a person in multiparty open-world dialogues, as a function of low-level communicative signals generated by his or her interlocutors. These signals include speech activity, eye-gaze direction, and head orientation, all of which can be estimated in real time during the interaction. By utilizing these signals and novel data representations suitable for the task and context, the developed methods can generate…
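The abstract describes models driven by per-frame, low-level signals estimated from each interlocutor. A minimal sketch of such a data representation, assuming hypothetical field names, dimensions, and a two-interlocutor setting; this is illustrative only, not the authors' released code:

```python
# A minimal sketch (not the authors' code) of the input representation the
# abstract describes: per-frame, low-level signals from each interlocutor --
# speech activity, eye-gaze direction, head orientation -- concatenated into
# one feature vector. All names and dimensions are illustrative assumptions.
import numpy as np

def frame_features(interlocutors):
    """Build one feature vector from a list of interlocutor states.

    Each state is assumed to hold:
      speaking -- bool, voice-activity flag
      gaze_dir -- (3,) unit vector, estimated eye-gaze direction
      head_rot -- (3,) Euler angles, estimated head orientation
    """
    parts = []
    for p in interlocutors:
        parts.append([float(p["speaking"])])
        parts.append(np.asarray(p["gaze_dir"], dtype=np.float32))
        parts.append(np.asarray(p["head_rot"], dtype=np.float32))
    return np.concatenate([np.ravel(x) for x in parts])

# Example: two interlocutors -> a 14-dimensional frame vector (1 + 3 + 3 each).
x = frame_features([
    {"speaking": True,  "gaze_dir": [0.0, 0.1, 0.99], "head_rot": [0.0, 0.2, 0.0]},
    {"speaking": False, "gaze_dir": [0.1, 0.0, 0.99], "head_rot": [0.1, 0.0, 0.0]},
])
print(x.shape)  # (14,)
```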

Cited by 12 publications (14 citation statements)
References 29 publications
“…This technique has the advantage of generating and updating head motions in short time spans. In that sense, the work of Stefanov et al [43] has similarities to our work. They trained forward neural networks with end-to-end features of multiparty interaction.…”
Section: Related Work (supporting)
confidence: 83%
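The quoted statement attributes feedforward networks over multiparty interaction features to Stefanov et al. [43]. A minimal sketch of such a classifier, assuming hypothetical layer sizes and an attention-target label set; the actual architecture in [43] may differ:

```python
# Minimal sketch, not the paper's architecture: a feedforward network that
# maps a frame-level multiparty feature vector to one of K attention targets
# (e.g., each interlocutor, or "elsewhere"). Layer sizes are assumptions.
import torch
import torch.nn as nn

class AttentionMLP(nn.Module):
    def __init__(self, in_dim: int, num_targets: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_targets),  # logits over attention targets
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = AttentionMLP(in_dim=14, num_targets=3)
logits = model(torch.randn(8, 14))  # batch of 8 frame vectors
target = logits.argmax(dim=-1)      # predicted attention target per frame
```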
“…This implies that mutual attention and speech activities attracted the participants' attention most, while the contents of utterances and head movements were not as distinguishing in various attention combinations of the participants. As described in Section 2, Stefanov et al [43] is the only work that we could find that shares similar objectives at this moment. Although their work cannot be directly compared with our work in the sense of performance due to very different settings of the underlying dataset, purpose, and evaluation metrics, the same tendency in evaluation results can be found:…”
Section: Automatic Prediction Model (mentioning)
confidence: 59%
“…As reviewed in other articles (e.g., Admoni and Scassellati, 2017 ; Stefanov et al, 2019 ), research on the relationship between gaze and speech revealed their close coupling in communication settings (Prasov and Chai, 2008 ; Qu and Chai, 2009 ; Andrist et al, 2014 ). In the present study, we investigated the relation between speech (particularly high-level features of it) and gaze direction (i.e., face gaze or aversion) in a dyadic conversation.…”
Section: Introduction (mentioning)
confidence: 88%
“…Moreover, by using the data gathered experimentally, we trained the simplified versions of two deep networks, the ResNets (He et al, 2016) and VGGNet (Simonyan and Zisserman, 2015) that predict gaze direction based on high-level speech features. Stefanov et al (2019) showed that listener's gaze direction could be modeled from low-level speech features without considering semantic information, and they concluded that different methods are required for modeling speaker's gaze direction. In successful communication, the listener understands what the speaker says the way the speaker desires.…”
Section: The Present Study (mentioning)
confidence: 99%
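The quoted study trains simplified ResNet/VGG variants that map high-level speech features to a binary gaze-direction label (face gaze vs. aversion). A rough VGG-style 1-D convolutional sketch under assumed feature dimensions, not the cited study's exact model:

```python
# Illustrative sketch only (not the cited study's exact model): a small
# VGG-style 1-D convolutional network that classifies a window of speech
# features (an F x T feature map) as face gaze vs. gaze aversion.
# Feature dimensions and layer sizes are assumptions.
import torch
import torch.nn as nn

class SpeechGazeNet(nn.Module):
    def __init__(self, n_features: int = 40, n_classes: int = 2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),        # pool over the time axis
        )
        self.fc = nn.Linear(64, n_classes)  # face gaze vs. aversion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, time)
        return self.fc(self.conv(x).squeeze(-1))

model = SpeechGazeNet()
logits = model(torch.randn(4, 40, 100))  # 4 windows of 100 feature frames
```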