Published: 2015
DOI: 10.1016/j.patrec.2014.10.002

Combining dynamic head pose–gaze mapping with the robot conversational state for attention recognition in human–robot interactions

Abstract: The ability to recognize the Visual Focus of Attention (VFOA, i.e. what or whom a person is looking at) of people is important for robots or conversational agents interacting with multiple people, since it plays a key role in turn-taking, engagement or intention monitoring. As eye gaze estimation is often impossible to achieve, most systems currently rely on head pose as an approximation, creating ambiguities since the same head pose can be used to look at different VFOA targets. To address this challenge, we …
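To make the head pose–VFOA ambiguity described in the abstract concrete, here is a minimal Python sketch. It is not from the paper: the target directions, the scene layout, and the head–gaze coupling constant `alpha` are all illustrative assumptions. It assigns a VFOA target by matching a gaze estimate derived from head pose against known target directions, and shows how the same head pose maps to different targets depending on how much of the gaze rotation the head is assumed to carry.

```python
import numpy as np

# Hypothetical target directions (pan, tilt) in degrees, as seen from the
# person. These values are assumptions for illustration only.
TARGETS = {
    "robot":   (0.0,  -5.0),
    "person2": (35.0,  0.0),
    "tablet":  (30.0, -25.0),
}

def vfoa_from_head_pose(pan, tilt, alpha=1.0):
    """Assign a VFOA target from head pose alone.

    Because the eyes rotate within the head, the head typically performs
    only a fraction `alpha` of the full rotation toward a gaze target; we
    approximate the gaze direction as head pose scaled by 1/alpha.
    """
    gaze_est = np.array([pan, tilt]) / alpha
    dists = {name: np.linalg.norm(gaze_est - np.array(d))
             for name, d in TARGETS.items()}
    return min(dists, key=dists.get)

# The same head pose yields different VFOA decisions depending on the
# assumed head-gaze coupling: this is the ambiguity the paper resolves
# with context such as the robot's conversational state.
print(vfoa_from_head_pose(10.0, -3.0, alpha=1.0))  # -> 'robot'
print(vfoa_from_head_pose(10.0, -3.0, alpha=0.5))  # -> 'person2'
```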

Cited by 54 publications (42 citation statements)
References 19 publications
“…Moreover, [31] uses cross-modal information, namely the speaker identity based on the audio track (one of the participants or the robot) as well as the identity of the object of interest. We also note that [31] reports mean FRR values obtained over all the test recordings, instead of an FRR value for each recording. Table IV summarizes a comparison between the average FRR obtained with our method, with [26], and with [31].…”
Section: Results with RGB Data (mentioning)
confidence: 99%
“…This can be explained by the fact that social interactions, even in different contexts, share many characteristics. We compared our method with three other methods, based on HMMs [26], on input-output HMMs [31], and on a geometric model [4]. The appeal of these methods (including ours) is that, unlike many existing gaze estimation approaches, they do not require eye detection.…”
Section: Discussion (mentioning)
confidence: 99%
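For readers unfamiliar with the HMM-based baselines cited above ([26], [31]), the following is a minimal sketch, not the cited implementations: a Viterbi decoder over VFOA targets in which hidden states are targets, observations are head-pose angles scored with Gaussian likelihoods, and sticky self-transitions encode that attention tends to persist. All target positions and parameters are invented for illustration.

```python
import numpy as np

targets = ["robot", "person2", "tablet"]
mean_pose = np.array([[0.0, -5.0], [35.0, 0.0], [30.0, -25.0]])  # (pan, tilt) per target
sigma = 10.0                                   # assumed observation noise (degrees)
A = np.full((3, 3), 0.05) + np.eye(3) * 0.85   # sticky transitions: attention persists
pi = np.full(3, 1.0 / 3.0)                     # uniform initial distribution

def viterbi(obs):
    """obs: (T, 2) array of head poses; returns the most likely target sequence."""
    T = len(obs)
    # Log Gaussian emission scores (up to a constant) for every frame/target pair.
    logB = -0.5 * np.sum((obs[:, None, :] - mean_pose) ** 2, axis=2) / sigma**2
    delta = np.log(pi) + logB[0]
    back = np.zeros((T, 3), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)    # scores[i, j]: from state i to state j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(3)] + logB[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [targets[i] for i in reversed(path)]

poses = np.array([[2.0, -4.0], [1.0, -6.0], [33.0, -2.0],
                  [34.0, 1.0], [31.0, -23.0], [29.0, -26.0]])
print(viterbi(poses))  # ['robot', 'robot', 'person2', 'person2', 'tablet', 'tablet']
```

As in the methods compared above, nothing here requires eye detection: the temporal model disambiguates head poses that lie between targets.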
“…Experimental results have revealed considerable differences between statistical methods and non-statistical measurements [23][24][25][26]. The former mainly focus on appearance-based measurements, whereas the latter usually consider geometric relationship cues, such as the deviation of the nose from the mid-line and the deviation between the new head pose and the original state.…”
Section: Non-statistical Approaches (mentioning)
confidence: 99%
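As a concrete illustration of the geometric cues this snippet describes, here is a small Python sketch. It is an assumption-laden toy, not any cited method: it turns the deviation of the nose tip from the facial mid-line into a coarse head-yaw label, with the landmark layout and threshold invented for illustration.

```python
def yaw_from_landmarks(left_eye, right_eye, nose_tip, thresh=0.08):
    """Classify coarse head yaw from 2D landmarks (x, y) in pixels.

    Uses the normalised horizontal deviation of the nose tip from the
    mid-line between the eyes, one of the geometric cues mentioned above.
    """
    mid_x = 0.5 * (left_eye[0] + right_eye[0])      # facial mid-line
    inter_ocular = abs(right_eye[0] - left_eye[0])  # scale normaliser
    dev = (nose_tip[0] - mid_x) / inter_ocular      # signed, scale-free
    if dev > thresh:
        return "right"
    if dev < -thresh:
        return "left"
    return "frontal"

# Nose tip displaced toward the right eye -> head turned right (toy numbers).
print(yaw_from_landmarks((100, 120), (160, 120), (142, 150)))  # 'right'
```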