Gaze and face-to-face interaction

This chapter describes experimental and modeling work aimed at describing the gaze patterns that interlocutors exchange during situated, task-directed, face-to-face two-way interactions. We will show that these gaze patterns (including blinking rate) are significantly influenced by the cognitive states of the interlocutors (speaking, listening, thinking, etc.), by their respective roles in the conversation (e.g. instruction giver, respondent), and by their social relationship (e.g. colleague, supervisor). The chapter provides insights into the (micro-)coordination of gaze with other components of attention management, as well as methodologies for capturing and modeling the behavioral regularities observed in experimental data. Particular emphasis is put on statistical models, which are able to learn behaviors in a data-driven way. We will introduce several statistical models of multimodal behavior that can be trained on such multimodal signals and that generate behaviors given perceptual cues. We will notably compare the performance and properties of models that explicitly capture the temporal structure of the studied signals and relate them to internal cognitive states. In particular, we study Hidden Semi-Markov Models and Dynamic Bayesian Networks and compare them to classifiers without a sequential model (Support Vector Machines and Decision Trees). We will further show that the gaze of conversational agents (virtual talking heads, speaking robots) may have a strong impact on communication efficiency.
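The contrast between sequential models (Hidden Semi-Markov Models, Dynamic Bayesian Networks) and frame-wise classifiers (Support Vector Machines, Decision Trees) comes down to whether the temporal structure of the signal is exploited. As a minimal, hypothetical sketch (the state labels, emission log-likelihoods, and transition probabilities below are invented for illustration, not taken from the chapter's data or models), a Viterbi decoder over a two-state Markov chain can correct a noisy frame that a frame-by-frame classifier would mislabel:

```python
import numpy as np

# Two hypothetical cognitive states: "speaking" = 0, "listening" = 1.
# loglik[t, s] = log p(observation at frame t | cognitive state s);
# the numbers are made up for illustration.
loglik = np.array([
    [ 0.0, -3.0],   # frames 0-1: evidence clearly favors "speaking"
    [ 0.0, -3.0],
    [-0.5,  0.0],   # frame 2: a noisy frame weakly favoring "listening"
    [ 0.0, -3.0],   # frame 3: clearly "speaking" again
    [-3.0,  0.0],   # frames 4-5: evidence clearly favors "listening"
    [-3.0,  0.0],
])
log_trans = np.log(np.array([[0.9, 0.1],   # "sticky" transitions:
                             [0.1, 0.9]])) # states persist across frames
log_init = np.log(np.array([0.5, 0.5]))

def viterbi(loglik, log_trans, log_init):
    """Most likely state sequence under a first-order Markov chain."""
    T, S = loglik.shape
    delta = np.zeros((T, S))            # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)  # argmax predecessor for backtracking
    delta[0] = log_init + loglik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # indexed (from, to)
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + loglik[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

framewise = loglik.argmax(axis=1).tolist()         # no temporal model
sequential = viterbi(loglik, log_trans, log_init)  # Markov smoothing

print(framewise)   # [0, 0, 1, 0, 1, 1] -- the noisy frame is mislabeled
print(sequential)  # [0, 0, 0, 0, 1, 1] -- sticky transitions smooth it out
```

The frame-wise decision flips spuriously on the noisy frame, whereas the sequential decoder pays a transition penalty for switching state and therefore keeps a coherent "speaking" segment; this is the kind of temporal regularity that HSMMs and DBNs additionally model through explicit state-duration and dependency structure.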
One of the conclusions we draw from these experiments is that multimodal behavioral models able to generate co-verbal gaze patterns should be designed with great care, so as not to increase the cognitive load of the human partner. Experiments involving impoverished or irrelevant control of the gaze of artificial agents (virtual talking heads and humanoid robots) have demonstrated its negative impact on communication (Garau, Slater, Bee, & Sasse, 2001).
Introduction

The social relevance of the eyes has been extensively investigated. While visually salient objects attract attention, the cognitive demands of visual search easily override the contrastive properties (i.e., the spatiotemporal multimodal salience) of objects (Henderson, Malcolm, & Schandl, 2009). This is particularly the case for faces (Bindemann, Burton, Hooge, Jenkins, & de Haan, 2005), and notably for faces making direct eye contact; see Senju and Hasegawa (Senju & Hasegawa, 2005) for a review. Võ et al. (Võ, Smith, Mital, & Henderson, 2012) argue for a functional, information-seeking use of gaze allocation during dynamic face viewing. Properly replicating the movement and appearance of the human eye is a challenging issue when building virtual agents or social robots able to engage in believable and smooth communication with human partners (Marschner, Pannasch, Schulz, & Graupner, 2015; Ruhland et al., 2014). We here review some key issues that pave the way towards context-aware gaze models. The chapter is orga...