People can understand how human interaction unfolds and can pinpoint social attitudes such as showing interest or social engagement with a conversational partner. However, summarising this understanding as a set of rules is difficult, as our judgement is sometimes subtle and subconscious. It is therefore challenging to program Non-Player Characters (NPCs) to react appropriately to social signals, which is important for immersive narrative games in Virtual Reality (VR). We collaborated with two game studios to develop an immersive machine learning (ML) pipeline for detecting social engagement. We collected data from participant-NPC interactions in VR, which were then annotated in the same immersive environment. Game design is a creative process, and it is vital to respect designers' creative vision and judgement; we therefore view annotation as a key part of the creative process. We trained a reinforcement learning algorithm, Proximal Policy Optimisation (PPO), with imitation learning rewards, using raw data (e.g. head position) and socially meaningful derived data (e.g. proxemics), and compared different ML configurations, including pre-training and a temporal memory (Long Short-Term Memory, LSTM). The configuration combining pre-training and an LSTM with derived data performed best (84% F1-score, 83% accuracy); the models using raw data did not generalise. Overall, this work introduces an immersive ML pipeline for detecting social engagement and demonstrates how creatives could use ML and VR to expand their ability to design more engaging experiences. Given the pipeline's results for social engagement detection, we generalise it to detecting human-defined social attitudes.
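To make the distinction between raw and derived inputs concrete, the sketch below shows how socially meaningful features such as interpersonal distance (proxemics) and facing angle could be computed from raw head-tracking data. This is a minimal sketch: the function name, feature set, and proxemic zone boundaries are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def derive_social_features(user_head_pos, user_head_fwd, npc_head_pos):
    """Illustrative derived features from raw VR head tracking.

    user_head_pos / npc_head_pos: (x, y, z) world positions.
    user_head_fwd: unit forward vector of the user's head.
    Feature names and thresholds are assumptions for illustration only.
    """
    to_npc = np.asarray(npc_head_pos) - np.asarray(user_head_pos)
    distance = float(np.linalg.norm(to_npc))          # proxemic distance
    to_npc_dir = to_npc / (distance + 1e-8)

    # Angle between where the user is facing and where the NPC is.
    facing_angle = float(np.degrees(
        np.arccos(np.clip(np.dot(user_head_fwd, to_npc_dir), -1.0, 1.0))
    ))

    # Coarse proxemic zone (Hall's zones, in metres) -- illustrative binning.
    if distance < 0.45:
        zone = "intimate"
    elif distance < 1.2:
        zone = "personal"
    elif distance < 3.6:
        zone = "social"
    else:
        zone = "public"

    return {"distance": distance, "facing_angle": facing_angle, "zone": zone}
```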
Figure 1: Examples of realistic (a, c) and cartoon (b, d) avatar upper bodies; the main menu controlling the session (e); a meeting between two participants in realistic avatars (f) and three in cartoon avatars (g), with the adjustable blue table marking the centre.
Nonverbal cues play multiple roles in social encounters, with gaze behaviour facilitating interaction and conversational flow. In this work, we explore conversation dynamics in dyadic, free-flowing discussions. Using automatic analysis (rather than manual labelling), we investigate how the gaze behaviour of one person relates to how often the other person changes their gaze (gaze-change frequency) and what their gaze target is (direct or averted gaze). Our results show that when one person is looked at, they change their gaze direction more frequently than when they are not looked at. They also tend to maintain direct gaze towards the other person when they are not looked at.
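As a rough illustration of how such an automatic analysis could work, the sketch below derives a direct/averted gaze label and a gaze-change frequency from sampled gaze-direction vectors. The angular threshold, input format, and function name are hypothetical assumptions and not the authors' method.

```python
import numpy as np

def gaze_metrics(gaze_dirs, to_partner_dirs, fps, direct_thresh_deg=15.0):
    """Illustrative automatic gaze analysis for one person in a dyad.

    gaze_dirs: (T, 3) unit gaze-direction vectors of one person.
    to_partner_dirs: (T, 3) unit vectors from that person to their partner.
    fps: sampling rate of the recording in frames per second.
    The threshold and metric names are illustrative assumptions.
    """
    gaze_dirs = np.asarray(gaze_dirs)
    to_partner_dirs = np.asarray(to_partner_dirs)

    # Direct vs. averted gaze: within a small angle of the partner's head.
    cos_angle = np.sum(gaze_dirs * to_partner_dirs, axis=1)
    angles = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    direct = angles < direct_thresh_deg

    # Gaze-change frequency: direct/averted transitions per second.
    changes = np.count_nonzero(direct[1:] != direct[:-1])
    duration_s = len(direct) / fps
    change_freq = changes / duration_s if duration_s > 0 else 0.0

    return {
        "direct_gaze_ratio": float(direct.mean()),
        "gaze_change_freq_hz": change_freq,
    }
```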
CCS CONCEPTS: • Human-centered computing → HCI theory, concepts and models; • Computing methodologies → Intelligent agents.
Figure 1: Pipeline for detecting human-defined social attitudes, including immersive data collection (user interaction (A) and expert annotation (B)) for training the machine learning model. Training proceeds by pre-training the model and creating Generative Adversarial Imitation Learning (GAIL) rewards for the reinforcement learning algorithm Proximal Policy Optimisation (PPO), which also uses a Long Short-Term Memory (LSTM) temporal memory (C). This process exports a trained ML model (D). During a user-VC interaction (E), the trained model (F) detects the human-defined social attitude in real time (G), which could be used in different scenarios.
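For readers unfamiliar with how these components fit together, the snippet below sketches an illustrative trainer configuration combining PPO, GAIL imitation rewards, behavioural-cloning pre-training, and an LSTM memory, expressed as a Python dictionary. The keys loosely follow the style of Unity ML-Agents trainer configurations only as a familiar format; the specific toolkit, hyperparameter values, and demo paths are assumptions, not the paper's settings.

```python
# Illustrative trainer configuration mirroring the components in Figure 1 (C):
# PPO with GAIL imitation rewards, behavioural-cloning pre-training, and an
# LSTM temporal memory. All values and paths below are hypothetical.
engagement_trainer_config = {
    "trainer_type": "ppo",
    "hyperparameters": {
        "batch_size": 256,
        "learning_rate": 3.0e-4,
    },
    "network_settings": {
        "hidden_units": 128,
        "memory": {                     # LSTM temporal memory
            "memory_size": 128,
            "sequence_length": 64,
        },
    },
    "reward_signals": {
        "gail": {                       # imitation rewards from annotated demos
            "strength": 1.0,
            "demo_path": "demos/engagement.demo",   # hypothetical path
        },
    },
    "behavioral_cloning": {             # pre-training on expert annotations
        "demo_path": "demos/engagement.demo",       # hypothetical path
        "strength": 0.5,
    },
}
```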