2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw.2017.245
Facial Affect Estimation in the Wild Using Deep Residual and Convolutional Networks

Abstract: Automated affective computing in the wild is a challenging task in the field of computer vision. This paper presents three neural network-based methods proposed for the task of facial affect estimation submitted to the First Affect-in-the-Wild challenge. These methods are based on Inception-ResNet modules redesigned specifically for the task of facial affect estimation. These methods are: Shallow Inception-ResNet, Deep Inception-ResNet, and Inception-ResNet with LSTMs. These networks extract facial features in…

Cited by 17 publications (9 citation statements)
References 35 publications
“…Arousal/Valence: CNN-M [19] 0.140/0.130; MM-Net [36] 0.088/0.134; FATAUVA [37] 0.095/0.123; DRC-Net [38] 0.094/0.161; PersEmoN (ours) 0.108/0.125. … follows in Eq. (13), where N_t denotes the total number of testing samples, Y_P the ground truth, P_i the prediction, and Ȳ_P the average value of the ground truth.…”
Section: Methods
confidence: 99%
“…On the other hand, all traditional layers use a batch normalization layer, and ReLU is applied as the activation function to mitigate the vanishing gradient problem. Moreover, following the proposal by Hasani et al. [35], we changed all "valid" padding to "same" padding, so that every output grid has the same size as its input grid and the feature maps do not shrink. This model was implemented with the TensorFlow libraries on an NVIDIA GeForce GTX 1080 Ti GPU with a learning rate of 0.0001.…”
Section: Inception-ResNet v2 Model Architecture
confidence: 99%
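The "valid" versus "same" padding change described above can be illustrated with TensorFlow's standard output-size conventions. This is a minimal sketch; `conv_out_size` is a hypothetical helper implementing the usual formulas, not code from the cited model:

```python
import math

def conv_out_size(n, k, s, padding):
    """Spatial output size of a convolution (TensorFlow convention).

    n: input size, k: kernel size, s: stride.
    "valid" uses no padding, so the grid shrinks;
    "same" pads so the output covers every input position.
    """
    if padding == "valid":
        return math.floor((n - k) / s) + 1
    if padding == "same":
        return math.ceil(n / s)
    raise ValueError(f"unknown padding: {padding}")

# A 35x35 grid through a 3x3 convolution, stride 1:
print(conv_out_size(35, 3, 1, "valid"))  # 33 -- the grid shrinks
print(conv_out_size(35, 3, 1, "same"))   # 35 -- the grid is preserved
```

With stride 1, "same" padding keeps output grids exactly the size of their inputs, which is the property the citing authors rely on.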
“…In [24] a Bidirectional Long Short-Term Memory network (BLSTM) is used with hand-crafted features, showing better results than SVR. Hasani et al. [12] extract features using an Inception module [42]; combining it with an LSTM yields better results than using the Inception module alone on a per-frame basis.…”
Section: Related Work
confidence: 99%
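The advantage the statement above describes, temporal context from a recurrent layer versus independent per-frame prediction, can be sketched structurally. The names `per_frame`, `recurrent`, `predict`, and `step` are illustrative stand-ins under stated assumptions, not the authors' implementation:

```python
def per_frame(frames, predict):
    """Independent prediction for every frame: no temporal context."""
    return [predict(f) for f in frames]

def recurrent(frames, predict, step, h0=0.0):
    """LSTM-style scheme: a hidden state carries context across frames."""
    h, outputs = h0, []
    for f in frames:
        h = step(h, f)          # update the temporal state with this frame
        outputs.append(predict(h))
    return outputs

# Toy usage with scalar "features": the recurrent variant's output at
# each frame depends on all frames seen so far, the per-frame one's does not.
frames = [1, 2, 3]
print(per_frame(frames, lambda x: 2 * x))
print(recurrent(frames, lambda h: 2 * h, lambda h, f: h + f))
```

The design point is only that the recurrent variant threads state through the sequence, which is what lets an Inception-plus-LSTM model outperform the same features scored frame by frame.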