2022
DOI: 10.48550/arxiv.2203.13235
Preprint

Facial Expression Recognition based on Multi-head Cross Attention Network

Abstract: Facial expression recognition in-the-wild is essential for various interactive computing domains. In this paper, we propose an extended version of the DAN model to address the VA estimation and facial expression classification challenges introduced in ABAW 2022. Our method produced preliminary results of a 0.44 mean CCC value for the VA estimation task and a 0.33 average F1 score for the expression classification task.
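For context, the mean CCC value reported for the VA task is the average of the per-dimension Concordance Correlation Coefficients over valence and arousal. The sketch below shows how that metric is conventionally computed; the function names and NumPy usage are illustrative, not taken from the paper's code.

```python
import numpy as np

def ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Concordance Correlation Coefficient between two 1-D arrays."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mu_t) * (y_pred - mu_p)).mean()
    return 2 * cov / (var_t + var_p + (mu_t - mu_p) ** 2)

def mean_ccc(val_true, val_pred, aro_true, aro_pred) -> float:
    """Mean CCC over the valence and arousal dimensions (the ABAW VA metric)."""
    return 0.5 * (ccc(val_true, val_pred) + ccc(aro_true, aro_pred))
```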


Cited by 3 publications (3 citation statements)
References 9 publications
“…In the 3rd ABAW Competition 2022, Zhang et al. [30] introduced a transformer-based fusion module that extracted the visual features and the dynamic spatial-temporal features and ranked 1st place. Jeong et al. [7] adopted an extended DAN model to address the facial expression task.…”
Section: Facial Expression Classification
confidence: 99%
“…Zhang et al. [37] proposed a transformer-based fusion module to fuse multi-modality features from audio, image, and word information. Jeong et al. [11] extended the DAN model and achieved 2nd place in ABAW3. Xue et al. [30] utilized a coarse-to-fine cascade network with a temporal smoothing strategy and ranked 3rd in ABAW3.…”
Section: Related Work
confidence: 99%
“…Therefore, it is worth exploring how to use static images for a more generalized usage scenario. Jeong et al. [9] use an affinity loss to train a feature extractor for images. In addition, they propose a multi-head attention network that operates in a coordinated manner to extract diverse attention for EXPR.…”
Section: Related Work
confidence: 99%
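The citation statement above describes the method as a feature extractor trained with an affinity loss plus several attention heads that each attend to different facial regions. A minimal PyTorch-style sketch of that "diverse attention" idea is given below; the layer choices, sizes, and the omission of the affinity loss term are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiHeadAttentionPool(nn.Module):
    """Illustrative sketch: several independent spatial-attention heads over a
    shared CNN feature map, each producing its own attention-weighted feature
    vector; the pooled features are concatenated for expression classification."""

    def __init__(self, channels: int = 512, num_heads: int = 4, num_classes: int = 8):
        super().__init__()
        # one 1x1-conv attention map per head (assumed head design)
        self.heads = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_heads)]
        )
        self.classifier = nn.Linear(channels * num_heads, num_classes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:  # feat: (B, C, H, W)
        pooled = []
        for head in self.heads:
            attn = torch.sigmoid(head(feat))                 # (B, 1, H, W) spatial attention
            pooled.append((feat * attn).mean(dim=(2, 3)))    # attention-weighted pooling -> (B, C)
        fused = torch.cat(pooled, dim=1)                     # concatenate per-head features
        return self.classifier(fused)
```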