International Conference on Multimodal Interaction 2022
DOI: 10.1145/3536220.3558806

Automatic facial expressions, gaze direction and head movements generation of a virtual agent

Abstract: In this article, we present two models to jointly and automatically generate the head, facial and gaze movements of a virtual agent from acoustic speech features. Two architectures are explored: a Generative Adversarial Network and an Adversarial Encoder-Decoder. Head movements and gaze orientation are generated as 3D coordinates, while facial expressions are generated using action units based on the facial action coding system. A large corpus of almost 4 hours of videos, involving 89 different speakers is use…
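The abstract describes an adversarial setup that maps acoustic speech features to action-unit activations together with 3D head and gaze coordinates. The paper's code is not reproduced here; the following is only a minimal PyTorch sketch of that kind of adversarial encoder-decoder, in which every dimension, layer size and name (N_ACOUSTIC, N_AU, Generator, Discriminator, the GRU layers, the loss weighting) is an illustrative assumption rather than the authors' implementation.

```python
# Minimal sketch of an adversarial encoder-decoder mapping acoustic speech
# features to facial action units (AUs), head rotation and gaze direction.
# All dimensions and names are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

N_ACOUSTIC = 26      # acoustic features per frame (assumed)
N_AU       = 17      # action-unit intensities (assumed)
N_HEAD     = 3       # head rotation as 3D coordinates (assumed)
N_GAZE     = 3       # gaze direction as 3D coordinates (assumed)
N_OUT      = N_AU + N_HEAD + N_GAZE

class Generator(nn.Module):
    """Encode an acoustic sequence, then decode a behaviour sequence."""
    def __init__(self, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(N_ACOUSTIC, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, N_OUT)

    def forward(self, speech):                 # speech: (B, T, N_ACOUSTIC)
        enc, _ = self.encoder(speech)          # (B, T, hidden)
        dec, _ = self.decoder(enc)             # (B, T, hidden)
        return self.head(dec)                  # (B, T, N_OUT)

class Discriminator(nn.Module):
    """Judge whether a (speech, behaviour) pair looks like real data."""
    def __init__(self, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(N_ACOUSTIC + N_OUT, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, speech, behaviour):
        x = torch.cat([speech, behaviour], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h[:, -1])              # one real/fake logit per sequence

# One illustrative adversarial training step on dummy data.
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

speech = torch.randn(8, 100, N_ACOUSTIC)       # dummy batch: 8 clips, 100 frames
real   = torch.randn(8, 100, N_OUT)            # dummy ground-truth behaviour

fake = G(speech)
d_loss = bce(D(speech, real), torch.ones(8, 1)) + \
         bce(D(speech, fake.detach()), torch.zeros(8, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

g_loss = bce(D(speech, fake), torch.ones(8, 1)) + \
         nn.functional.l1_loss(fake, real)     # adversarial + reconstruction term
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In the GAN variant mentioned in the abstract, a noise-conditioned generator would typically take the place of the encoder-decoder above, while the speech-conditioned discriminator plays the same role; the reconstruction term is a common way to keep such generators close to ground-truth motion.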

Cited by 7 publications (3 citation statements)
References 39 publications
“…It relies on the frequency components of the movement's speed profile (i.e., changes of speed over time), represented using the Fourier magnitude spectrum. Closely related to this motion invariant but sharing the data‐driven comparison, let us also mention the study of Delbosc et al [DOS*23], which aimed at generating synchronized and believable facial non‐verbal animations for conversational VH. The authors proposed to evaluate their resulting animation against ground‐truth data, both using a distance metric based on DTW and by comparing jerk as a good indicator of motion naturalness.…”
Section: Evaluation Methods
confidence: 99%
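The evaluation summarized in this statement rests on two measures: a DTW-based distance between generated and ground-truth motion, and jerk as an indicator of naturalness. As a rough, self-contained illustration (not the cited authors' code), both can be sketched in a few lines of NumPy; the Euclidean frame distance, the 25 fps sampling rate and the function names are assumptions.

```python
# Illustrative sketch of the two measures mentioned above: a DTW distance between
# generated and ground-truth motion, and jerk (third derivative of position) as a
# rough indicator of motion naturalness. Not the cited authors' implementation.
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping cost between two (T, D) motion sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # Euclidean frame distance (assumed)
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def mean_jerk(x, fps=25.0):
    """Average jerk magnitude of a (T, D) trajectory sampled at `fps` frames per second."""
    dt = 1.0 / fps
    jerk = np.diff(x, n=3, axis=0) / dt**3            # third finite difference of position
    return float(np.mean(np.linalg.norm(jerk, axis=1)))

# Example: compare generated head motion against ground truth (dummy data).
generated = np.random.randn(100, 3)
reference = np.random.randn(110, 3)
print("DTW distance:", dtw_distance(generated, reference))
print("jerk (generated):", mean_jerk(generated), "jerk (reference):", mean_jerk(reference))
```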
“…An in-depth study of the corpus combining conversational analysis and data mining methods will be performed to construct the computational behavioral model. In particular, we aim to explore methods for the automatic generation of behavior from a corpus, as proposed in Cherni et al (2022) and Delbosc et al (2022).…”
Section: The Impact Of Attitude On the Behavior
confidence: 99%
“…However, it is important to acknowledge that facial expressions and head movements are inherently interconnected and synchronized with speech [9]. Habibie et al [25] or Delbosc et al [11] introduced an adversarial approach for the automatic generation of facial expressions and head movements jointly. Drawing inspiration from these works, our research focuses on analyzing facial expressions and head movements in a combined manner, with a representation of facial expressions using explainable features, specifically facial action units.…”
Section: Outputs Of the Models
confidence: 99%