2018
DOI: 10.1007/978-3-030-01258-8_19

Deep Structure Inference Network for Facial Action Unit Recognition

Abstract: Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for developing general facial expression analysis. In recent years, most efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between Action Units. In this paper, we propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing al…
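As a rough illustration of the architecture the abstract describes, the sketch below combines per-patch (local) encoders with a whole-face (global) encoder and fuses them into multi-label AU logits. It is a minimal PyTorch sketch, not the authors' implementation: the class name LocalGlobalAUNet, the layer sizes, and the fixed patch layout are assumptions.

```python
# Minimal sketch of local/global feature combination for AU recognition.
# Illustrative only: layer sizes, class name and patch layout are assumptions.
import torch
import torch.nn as nn

class LocalGlobalAUNet(nn.Module):
    def __init__(self, num_aus=12, num_patches=4, feat_dim=64):
        super().__init__()
        # One small encoder per facial patch (local features).
        self.patch_encoders = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(16, feat_dim))
            for _ in range(num_patches)])
        # One encoder over the whole face crop (global features).
        self.global_encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim))
        # Fused local+global features -> one logit per AU (multi-label output).
        self.au_head = nn.Linear(feat_dim * (num_patches + 1), num_aus)

    def forward(self, face, patches):
        feats = [enc(p) for enc, p in zip(self.patch_encoders, patches)]
        feats.append(self.global_encoder(face))
        return self.au_head(torch.cat(feats, dim=1))

# Dummy usage: a batch of 2 faces with 4 patches each.
model = LocalGlobalAUNet()
face = torch.randn(2, 3, 64, 64)
patches = [torch.randn(2, 3, 32, 32) for _ in range(4)]
print(model(face, patches).shape)  # torch.Size([2, 12])
```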

Cited by 114 publications (67 citation statements) | References 34 publications

“…As they are based on handcrafted low-level features, the whole algorithm framework cannot be performed end-to-end, which greatly restricts the model's efficiency and the performance of the method. Recently, Corneanu et al. (Corneanu, Madadi, and Escalera 2018) proposed a Deep Structure Inference Network (DSIN) for AU recognition, which uses deep learning to extract image features and structure inference to capture AU relations by passing information between predictions in an explicit way. However, the relationship inference in DSIN works as a post-processing step at the label level and is isolated from the feature representation.…”
Section: Related Work
confidence: 99%
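To make the point in this excerpt concrete, here is a minimal sketch of label-level message passing applied as a post-processing step to AU probabilities, i.e. relations acting on predictions rather than on features. The relation matrix W, the update rule, and the function name refine_au_predictions are illustrative assumptions, not DSIN's actual inference procedure.

```python
# Label-level refinement of AU predictions via message passing (illustrative).
import numpy as np

def refine_au_predictions(probs, W, steps=5, alpha=0.5):
    """probs: (num_aus,) initial per-AU probabilities from a CNN.
    W: (num_aus, num_aus) signed AU relation weights (positive = co-occur)."""
    p = probs.copy()
    for _ in range(steps):
        messages = W @ p                            # evidence from related AUs
        logits = np.log(p / (1.0 - p) + 1e-8) + alpha * messages
        p = 1.0 / (1.0 + np.exp(-logits))           # map back to probabilities
    return p

# Dummy usage: three AUs, the first two tend to co-occur.
probs = np.array([0.8, 0.3, 0.55])
W = np.array([[0.0, 0.9, -0.4],
              [0.9, 0.0, 0.1],
              [-0.4, 0.1, 0.0]])
print(refine_au_predictions(probs, W))
```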
“…We compare our method to alternative methods, including linear SVM (LSVM) (Fan et al. 2008), Joint Patch and Multi-label Learning (JPML) (Zhao et al. 2015), ConvNet with a locally connected layer (LCN) (Taigman et al. 2014), Deep Region and Multi-label Learning (DRML) (Zhao, Chu, and Zhang 2016), Region Adaptation, Multi-label Learning (ROI) (Li, Abtahi, and Zhu 2017) and the Deep Structure Inference Network (DSIN) (Corneanu, Madadi, and Escalera 2018). Tables 4 and 5 show the results for 12 AUs on BP4D and 8 AUs on DISFA.…”
Section: Comparison with the State of the Art
confidence: 99%
“…More recent approaches focus on obtaining local representations using patch learning. Some of these approaches divide the face image into uniform grids (Liu et al., 2014; Zhong et al., 2015; Zhao et al., 2016b), while others define patches around facial parts (Corneanu et al., 2018) or facial landmarks (Zhao et al., 2016a). Among them, Liu et al. (2014) divide a face image into non-overlapping patches and categorize them into common and specific patches to describe different expressions.…”
Section: Patch Learning
confidence: 99%
“…Zhao et al. (2016b) use a regionally connected convolutional layer that learns specific convolutional filters from sub-areas of the input. Corneanu et al. (2018) crop patches containing facial parts, train separate classifiers for each part, and fuse the decisions of the classifiers using structured learning. Zhao et al. (2016a) describe overlapping patches centered at facial landmarks, obtain shallow representations of the patches, and identify informative patches using a multi-label learning framework.…”
Section: Patch Learning
confidence: 99%
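A minimal sketch of the patch-learning idea discussed in these excerpts: crop fixed-size patches around facial landmarks, score each patch with its own classifier, and fuse the per-patch decisions. The simple average fusion, the patch size, and the helper names (crop_patches, fuse_patch_scores) are assumptions; the methods cited above use learned fusion such as structured or multi-label learning.

```python
# Patch cropping around landmarks plus per-patch decision fusion (illustrative).
import numpy as np

def crop_patches(image, landmarks, size=32):
    """image: (H, W, 3) array; landmarks: list of (x, y) pixel coordinates."""
    half = size // 2
    patches = []
    for x, y in landmarks:
        x0 = int(np.clip(x - half, 0, image.shape[1] - size))
        y0 = int(np.clip(y - half, 0, image.shape[0] - size))
        patches.append(image[y0:y0 + size, x0:x0 + size])
    return patches

def fuse_patch_scores(patches, classifiers):
    """classifiers: one scoring function per patch, each returning (num_aus,)."""
    scores = np.stack([clf(p) for clf, p in zip(classifiers, patches)])
    return scores.mean(axis=0)  # simple average fusion stands in for learned fusion

# Dummy usage: random image, two landmarks, toy per-patch "classifiers".
image = np.random.rand(128, 128, 3)
landmarks = [(40, 50), (90, 50)]
patches = crop_patches(image, landmarks)
classifiers = [lambda p: np.full(12, p.mean()) for _ in landmarks]
print(fuse_patch_scores(patches, classifiers).shape)  # (12,)
```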
“…They are gender (male, female), race (Asian, Afro-American and Caucasian), level of happiness (happy, slightly happy, neutral and other) and makeup (makeup, no makeup, not clear and very subtle makeup). Note that state-of-the-art methods could be used to accurately recognise such attributes from face images (e.g., [6], [15], [20], [21]). However, as the focus of our work is not on improving the recognition accuracy of such attributes, and because of the amount of data required to learn those associated tasks accurately, we decided to import them directly from the adopted dataset [2].…”
Section: Apparent Age Estimation
confidence: 99%