Cascade Attention Networks For Group Emotion Recognition with Face, Body and Image Cues

Wang, Kai; Zeng, Xianghua; Yang, Jianfei; Meng, Debin; Zhang, Kaipeng; Peng, Xiaojiang; Qiao, Yu

doi:10.1145/3242969.3264991

Cited by 38 publications

(27 citation statements)

References 13 publications

“…Both use classic descriptors for faces as well as upper bodies. In [80], CNNs are used for analysis of faces, scenes, and bodies, and [81] adds a skeleton analysis to the face and scene analysis, all done with CNNs. Faces, scenes and skeletons are also analysed with CNNs in [82], where on the face-level the CNN output is fed to an LSTM, and where on the scene-level an attention mask is placed over the image.…”

Section: Hybrid Approaches (mentioning)

confidence: 99%

“…Faces, scenes, and upper bodies: [34], [35]; Faces, scenes, and bodies/skeletons: [80], [81], [82]; Faces, scenes, skeletons, and visual attentions/objects: [2], [42]; Faces and objects: [83]; Faces, scenes, and places: [24] … and scene analysis), or fusion of individual emotions in a bottom-up approach.…”

Section: Aspects Description Studies (mentioning)

confidence: 99%

“…This is done in [81], [2], [78], [71], and [77]. A combination of individual predictions is also used in [34], [80], and [42], the latter employing majority voting. One or more fully-connected layers for fusion of individual aspects is employed in [1], [69], and [10].…”

Section: Fusion Of Different Analysis Aspects (mentioning)

confidence: 99%

“…A Bayesian approach is taken in [20], where individual emotions (resulting from face and speech features combined) influence group emotion through a Bayesian Network. Weighting with a neural touch is done in the final three studies to use decision-level fusion: in [79], [74], and [80], each individual emotion gets assigned a weight that is given by a neural network.…”

Section: Decision-level Fusion (mentioning)

confidence: 99%

Automatic Emotion Recognition for Groups: A Review

Veltmeijer, Gerritsen, Hindriks. 2023. IEEE Trans. Affective Comput.

This review aims to summarize and describe research on the topic of automatic group emotion recognition. In recent years, the topic of emotion analysis of groups or crowds has gained interest, with studies performing emotion detection in different contexts, using different datasets and modalities (such as images, video, audio, social media messages), and taking different approaches. Articles are included after an innovative search method, including Dense Query Extraction and automatic cross-referencing. Discussed are the types of groups and emotion models considered in automatic emotion recognition research, common datasets for all modalities, general approaches taken, and reported performances. These performances are discussed, followed by an analysis of the application possibilities of the discussed methods. To ensure clear, replicable, and comparable studies, we suggest research should test on multiple, common datasets and report on multiple metrics, when possible. Implementation details and code should be made available where possible. An area of interest for future work is to build systems with more real-world application possibilities, coping with changing group sizes, different emotional subgroups, and changing emotions over time, while having a higher robustness and working with datasets with reduced biases.

“…Traditionally, the analysis of the affective states of people has been done using individual data, meaning that only one person is present in the data stream. In the recent past, various papers on group emotion recognition, meaning the ability of extracting both grouped and individual affective states from still images, have gained some popularity [29,30,17,31,33]. However, the recognition through time of the affective state of individuals involved in group interactions has not yet been thoroughly addressed.…”

Section: Introduction (mentioning)

confidence: 99%

ODANet: Online Deep Appearance Network for Identity-Consistent Multi-person Tracking

Delorme, Ban, Sarrazin, et al. 2021. Pattern Recognition. ICPR International Workshops and Challenges

The analysis of affective states through time in multi-person scenarios is very challenging, because it requires consistently tracking all persons over time. This requires a robust visual appearance model capable of re-identifying people already tracked in the past, as well as spotting newcomers. In real-world applications, the appearance of the persons to be tracked is unknown in advance, and therefore one must devise methods that are both discriminative and flexible. Previous work in the literature proposed different tracking methods with fixed appearance models. These models made it possible, up to a certain extent, to discriminate between appearance samples of two different people. We propose an online deep appearance network (ODANet), a method able to simultaneously track people and update the appearance model with the newly gathered annotation-less images. Since this task is especially relevant for autonomous systems, we also describe a platform-independent robotic implementation of ODANet. Our experiments show the superiority of the proposed method with respect to the state of the art, and demonstrate the ability of ODANet to adapt to sudden changes in appearance, to integrate new appearances in the tracking system and to provide more identity-consistent tracks.

Super-Identity Convolutional Neural Network for Face Hallucination

Zhang, Cheng, et al. 2018. Lecture Notes in Computer Science

Self Cite

129

Face hallucination is a generative task to super-resolve a low-resolution facial image, while human perception of faces relies heavily on identity information. However, previous face hallucination approaches largely ignore facial identity recovery. This paper proposes Super-Identity Convolutional Neural Network (SICNN) to recover identity information for generating faces close to the real identity. Specifically, we define a super-identity loss to measure the identity difference between a hallucinated face and its corresponding high-resolution face within the hypersphere identity metric space. However, directly using this loss will lead to a Dynamic Domain Divergence problem, which is caused by the large margin between the high-resolution domain and the hallucination domain. To overcome this challenge, we present a domain-integrated training approach by constructing a robust identity metric for faces from these two domains. Extensive experimental evaluations demonstrate that the proposed SICNN achieves superior visual quality over the state-of-the-art methods on a challenging task to super-resolve 12×14 faces with an 8× upscaling factor. In addition, SICNN significantly improves the recognizability of ultra-low-resolution faces.
