Intensity-Aware Loss for Dynamic Facial Expression Recognition in the Wild

Li, Hanting; Niu, Hongjing; Zhu, Zhaoqing; Zhang, Feng

doi:10.1609/aaai.v37i1.25077

Cited by 20 publications

(6 citation statements)

References 48 publications


“…Its goal is to classify a facial video clip, rather than a still image, into one of the basic emotions. The field of DFER has attracted considerable attention from researchers [16][17][18][19][20][21][22]. These studies share a common goal of addressing challenges within environmental scenarios, such as occlusion, pose variation, and noisy frames.…”

Section: Related Work (mentioning)

confidence: 99%

“…This method incorporates emotion-driven loss functions to enhance recognition accuracy, but it may lack robustness in handling diverse environmental scenarios. Li et al (2023) [22] contributed to intensity-adaptive loss for dynamic facial expression recognition by integrating a global attentional bias (GCA) block and intensity-adaptive loss (IAL) to handle different expression intensities. While effective in addressing intensity variations, this approach may require additional computational overhead.…”

Section: Related Work (mentioning)

confidence: 99%

“…Wang et al (2023) [20] reimagined the learning paradigm for DFER by treating it as a weakly supervised problem and introduced the multi-3D dynamic facial expression learning (M3DFEL) framework with multi-instance learning (MIL). Additionally, variations in loss functions are investigated with Li et al (2023) [22], introducing Intensity-Aware Loss to distinguish samples with low expression intensity. While the Intensity-Aware Loss effectively handles expression intensity variations, it may introduce additional computational overhead during training, potentially limiting scalability to larger datasets.…”

Section: Related Work (mentioning)

confidence: 99%

“…This method emphasizes the efficacy of proposed loss functions and fusion parameters. Addressing the nuances of expression intensity, the Intensity-Aware Loss for Dynamic Facial Expression Recognition in the Wild method [22] employs a GCA Block, Dynamic-Static Fusion Module, and Temporal Transformer for feature extraction. Trained on PyTorch-GPU and Tesla V100 GPUs, it achieves performance on dynamic facial expression recognition tasks.…”

Section: Related Work (mentioning)

confidence: 99%

“…In this study, UAR and WAR are employed as primary evaluation metrics, aligning with established practices in the field of dynamic facial expression recognition. These metrics are widely used in previous studies for their effectiveness in evaluating model performance across various domains, including facial expression recognition [17][18][19][20][22][27]. UAR, computed as the average recall across all classes, provides an unbiased assessment of the model's ability to accurately classify facial expressions without favoring any specific class.…”

Section: Experimental Protocol (mentioning)

confidence: 99%
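
The two metrics named in the excerpt above can be illustrated with a small sketch. It assumes the usual definitions: UAR is the unweighted mean of per-class recall, and WAR is recall weighted by class support, which equals overall accuracy. The function name `uar_war` and the toy labels are hypothetical and are not taken from any of the cited papers.

```python
from collections import Counter


def uar_war(y_true, y_pred):
    """Return (UAR, WAR) for two equal-length label sequences.

    UAR: mean of per-class recall, each class counted equally.
    WAR: per-class recall weighted by class support (overall accuracy).
    Only classes that occur in y_true contribute to UAR.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)  # number of samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    recalls = [correct[c] / n for c, n in support.items()]
    uar = sum(recalls) / len(recalls)
    war = sum(correct.values()) / len(y_true)
    return uar, war


# Toy, imbalanced example (illustrative labels only).
y_true = ["happy"] * 8 + ["sad"] * 2
y_pred = ["happy"] * 8 + ["happy", "sad"]
uar, war = uar_war(y_true, y_pred)
print(f"UAR={uar:.2f} WAR={war:.2f}")
```

In this toy case the majority class is always right and the minority class is right half the time, so WAR (0.90) is higher than UAR (0.75). That gap is why work on imbalanced datasets tends to report both numbers.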


SlowR50-SA: A Self-Attention Enhanced Dynamic Facial Expression Recognition Model for Tactile Internet Applications

Neshov, Christoff, Sechkova et al. 2024

Electronics


Emotion recognition from facial expressions is a challenging task due to the subtle and nuanced nature of facial expressions. Within the framework of Tactile Internet (TI), the integration of this technology has the capacity to completely transform real-time user interactions, by delivering customized emotional input. The influence of this technology is far-reaching, as it may be used in immersive virtual reality interactions and remote tele-care applications to identify emotional states in patients. In this paper, a novel emotion recognition algorithm is presented that integrates a Self-Attention (SA) module into the SlowR50 backbone (SlowR50-SA). The experiments on the DFEW and FERV39K datasets demonstrate that the proposed model achieves good performance in terms of both Unweighted Average Recall (UAR) and Weighted Average Recall (WAR) metrics, achieving a UAR (WAR) of 57.09% (69.87%) on the DFEW dataset, and UAR (WAR) of 39.48% (49.34%) on the FERV39K dataset. Notably, SlowR50-SA operates with only eight frames of input at low temporal resolution, highlighting its efficiency. Furthermore, the algorithm has the potential to be integrated into Tactile Internet applications, where it can be used to enhance the user experience by providing real-time emotion feedback. SlowR50-SA can also be used to enhance virtual reality experiences by providing personalized haptic feedback based on the user’s emotional state. It can also be used in remote tele-care applications to detect signs of stress, anxiety, or depression in patients.
