This paper deals with a multimodal annotation scheme dedicated to the study of gestures in interpersonal communication, with particular regard to the role played by multimodal expressions for feedback, turn management and sequencing. The scheme has been developed under the framework of the MUMIN network and tested on the analysis of multimodal behaviour in short video clips in Swedish, Finnish and Danish. The preliminary results obtained in these studies show that the reliability of the categories defined in the scheme is acceptable, and that the scheme as a whole constitutes a versatile analysis tool for the study of multimodal communication behaviour.
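As a concrete illustration of the kind of record such an annotation scheme yields, the sketch below encodes a single head gesture with feedback, turn-management and sequencing attributes. The attribute names and values are simplified placeholders invented for this example, not the MUMIN scheme's actual tag set.

```python
# Illustrative sketch of a MUMIN-style annotation record. The attributes
# and values below are simplified placeholders; the actual scheme defines
# its own controlled vocabularies for feedback, turn management and
# sequencing.
from dataclasses import dataclass

@dataclass
class GestureAnnotation:
    start: float          # gesture onset in the video clip (s)
    end: float            # gesture offset (s)
    modality: str         # e.g. "head", "hand", "face", "gaze"
    shape: str            # form of the gesture, e.g. "nod"
    feedback: str         # communicative feedback function
    turn_management: str  # e.g. "turn-hold", "turn-yield"
    sequencing: str       # discourse-structuring function

nod = GestureAnnotation(12.4, 12.9, "head", "nod",
                        feedback="give/accept",
                        turn_management="turn-yield",
                        sequencing="none")
```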
Eye gaze is an important means of controlling interaction and coordinating the participants' turns smoothly. We have studied how eye gaze correlates with spoken interaction, focusing especially on the combined effect of the speech signal and gazing in predicting turn-taking possibilities. It is well known that mutual gaze is important for coordinating turn taking in two-party dialogues, and in this article we investigate whether this also holds for three-party conversations. In group interactions, different features may be used for managing turn taking than in two-party dialogues. We collected casual conversational data and used an eye tracker to systematically observe a participant's gaze during the interactions. By studying the combined effect of speech and gaze on turn taking, we aimed to answer our main questions: How well can eye gaze help in predicting turn taking? What is the role of eye gaze when the speaker holds the turn? Is the role of eye gaze as important in three-party dialogues as in two-party dialogues? We used Support Vector Machines (SVMs) to classify turn-taking events with respect to speech and gaze features, so as to estimate how well these features signal a change of speaker or a continuation by the same speaker. The results confirm the earlier hypothesis that eye gaze significantly helps in predicting the partner's turn-taking activity, and they also support our hypothesis that the speaker is a prominent coordinator of the interaction space. Such a turn-taking model could be used in interactive applications to improve a system's conversational performance.
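The sketch below illustrates the general classification setup with scikit-learn: an SVM trained on per-utterance speech and gaze features to separate turn hold from turn change. The feature names and the synthetic data are placeholders chosen for illustration; they are not the study's actual feature set or corpus.

```python
# Minimal sketch of SVM-based turn-taking classification (HOLD vs CHANGE)
# from speech and gaze features. All features and data are invented
# placeholders, not the authors' actual feature set or recordings.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n = 400
# Hypothetical features per inter-pausal unit:
#   pause_len    - silence after the unit (s)
#   f0_slope     - final pitch slope (semitones/s)
#   gaze_partner - fraction of the unit spent gazing at a partner
#   mutual_gaze  - 1 if mutual gaze occurs near the unit's end
X = np.column_stack([
    rng.exponential(0.4, n),   # pause_len
    rng.normal(0.0, 2.0, n),   # f0_slope
    rng.uniform(0.0, 1.0, n),  # gaze_partner
    rng.integers(0, 2, n),     # mutual_gaze
])
# Toy labels: turn CHANGE (1) when a long pause coincides with mutual gaze.
y = ((X[:, 0] > 0.4) & (X[:, 3] == 1)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te),
                            target_names=["HOLD", "CHANGE"]))
```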
We present an approach to enhancing the interaction abilities of the Nao humanoid robot by extending its communicative behavior with non-verbal gestures (hand and head movements, and gaze following). We identified a set of non-verbal gestures that Nao could use to enhance its presentation and turn-management capabilities in conversational interactions, and we discuss our approach to modeling and synthesizing these gestures on the Nao robot. We also present a scheme for system evaluation that compares users' expectations with their actual experiences. We found that open-arm gestures, head movements and gaze following could significantly enhance Nao's ability to be expressive and appear lively, and to engage human users in conversational interactions.
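A minimal sketch of how such gestures might be synthesized through the NAOqi Python SDK is given below. The joint angles, timings and the NAO_IP address are illustrative assumptions rather than the gesture parameters used in the work; only the ALMotion and ALTracker calls themselves are standard NAOqi API.

```python
# Sketch: a head nod, an open-arm gesture, and gaze following on Nao via
# the NAOqi Python SDK (Python 2). Angles and timings are illustrative
# guesses; NAO_IP is a placeholder for the robot's address.
from naoqi import ALProxy

NAO_IP = "192.168.1.10"  # placeholder
motion = ALProxy("ALMotion", NAO_IP, 9559)
tracker = ALProxy("ALTracker", NAO_IP, 9559)

motion.wakeUp()

# Head nod: pitch the head down and back up (angles in radians).
motion.angleInterpolation(
    ["HeadPitch"],   # joint names
    [[0.3, 0.0]],    # target angles per joint
    [[0.4, 0.8]],    # times (s) at which each angle is reached
    True)            # angles are absolute

# Open-arm gesture: spread both arms outward from the torso.
names = ["LShoulderRoll", "RShoulderRoll"]
angles = [[0.8], [-0.8]]
times = [[1.0], [1.0]]
motion.angleInterpolation(names, angles, times, True)

# Gaze following: track the nearest detected face with the head.
tracker.registerTarget("Face", 0.15)  # expected face width in metres
tracker.track("Face")
```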