Proceedings of the 19th ACM International Conference on Multimodal Interaction 2017
DOI: 10.1145/3136755.3143017

Group-level emotion recognition using deep models on image scene, faces, and skeletons

Cited by 28 publications (31 citation statements)
References: 22 publications
“…Holistic (scene-level) information is shown to be the important component in group-level classification in [10,12,24]. While analyzing the cohesiveness of a group of people, it is essential to understand the environments behind the people, e.g., students in a lecture tend to have a low cohesion level, while a group people standing and protesting at a plaza probably have high cohesiveness.…”
Section: Scene Features | Type: mentioning
confidence: 99%
“…The winning team [35] and the third team [38] utilized two streams of CNN, one for individual emotion recognition and the other for global-level emotion recognition, which are combined to get the final prediction of GER. The second team [14] developed a hybrid network that can utilize global scene features, skeleton features of the group, and also local facial features.…”
Section: Related Work | Type: mentioning
confidence: 99%
“…The state-of-the-art approaches for the GER task [14,35,38] employed a combination of top-down and bottom-up approaches. The bottom-up approach entails detecting faces from the image, extracting features and performing expression recognition from each of them, and finally combining individual predictions for group-level prediction.…”
Section: Introduction | Type: mentioning
confidence: 99%
“…The authors used a large-margin softmax loss for discriminative learning. In [11], we presented a hybrid network that exploited information from whole images, faces and the skeleton representation of the subjects in the image using CNNs. The problem of group emotion recognition is challenging due to face occlusions, illumination variations, head pose variations, varied indoor and outdoor settings, and faces at different distance from the camera which may lead to low-resolution face images [23].…”
Section: Introduction | Type: mentioning
confidence: 99%