Video emotion recognition has attracted increasing attention. Most existing approaches rely on spatial features extracted from video frames and ignore the contextual information in videos and the relationships among regions, which limits their performance. In this study, we propose a sparse spatial-temporal emotion graph convolutional network (SE-GCN) for video emotion recognition. For the spatial graph, the emotional relationship between any two emotion proposal regions is first computed, and a sparse spatial graph is constructed from these relationships. For the temporal graph, the emotional information contained in each emotion proposal region is analyzed, and a sparse temporal graph is constructed from the regions with rich emotional cues. Emotional relationship reasoning features are then obtained with the spatial-temporal GCN. Finally, the features of the emotion proposal regions and the spatial-temporal relationship features are fused to recognize the video emotion. Extensive experiments on four challenging benchmark datasets, namely MHED, HEIV, VideoEmotion-8, and Ekman-6, demonstrate that the proposed method achieves state-of-the-art performance.
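The following is a minimal sketch of the sparse spatial-graph construction and GCN reasoning step described in the abstract. The cosine-similarity measure of the "emotional relationship", the sparsity threshold, and the feature dimensions are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: sparse spatial graph over emotion proposal regions + one GCN layer.
# Assumptions: cosine similarity as the pairwise emotional relationship,
# a fixed threshold for sparsification, 2048-d backbone features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseSpatialGCN(nn.Module):
    def __init__(self, in_dim=2048, hid_dim=512, sparsity_threshold=0.5):
        super().__init__()
        self.threshold = sparsity_threshold
        self.gcn = nn.Linear(in_dim, hid_dim)   # shared GCN weight

    def build_sparse_graph(self, region_feats):
        # region_feats: (N, D) features of N emotion proposal regions.
        normed = F.normalize(region_feats, dim=-1)
        relation = normed @ normed.t()                         # (N, N) pairwise relationship
        adj = (relation > self.threshold).float() * relation   # keep only strong edges
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return adj / deg                                       # row-normalize

    def forward(self, region_feats):
        adj = self.build_sparse_graph(region_feats)
        # One round of graph reasoning: aggregate neighbors, then transform.
        return F.relu(self.gcn(adj @ region_feats))            # (N, hid_dim)

# Usage: 16 proposal regions with 2048-d features.
reasoned = SparseSpatialGCN()(torch.randn(16, 2048))
print(reasoned.shape)  # torch.Size([16, 512])
```

The temporal graph would follow the same pattern, with nodes drawn from the proposal regions selected across frames and the relationship features from both graphs fused with the region features before classification.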
Previous work on video action recognition mainly focuses on extracting spatial and temporal features from videos or on capturing physical dependencies among joints, while the relations between joints are often ignored. Modeling these relations is important for action recognition. To learn discriminative relations between joints, this paper proposes a joint spatial-temporal reasoning (JSTR) framework for recognizing actions from videos. For the spatial representation, a joint spatial relation graph is built to capture the position relations between joints. For the temporal representation, the temporal information of body joints is modeled by an intra-joint temporal relation graph. The spatial reasoning feature and the temporal reasoning feature are fused to recognize the action. The effectiveness of our method is demonstrated on three real-world video action recognition datasets, on all of which it achieves good performance.
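Below is a minimal sketch of the two relation graphs described in this abstract, assuming skeleton input of shape (T frames, J joints, C channels). The learned dot-product relation and single reasoning layer are illustrative choices rather than the paper's exact architecture.

```python
# Sketch: spatial relation graph over joints within a frame and an
# intra-joint temporal relation graph over frames, fused for classification.
# Hidden size, relation measure, and pooling are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSpatialTemporalReasoning(nn.Module):
    def __init__(self, in_dim=3, hid_dim=64, num_classes=60):
        super().__init__()
        self.embed = nn.Linear(in_dim, hid_dim)
        self.spatial_fc = nn.Linear(hid_dim, hid_dim)
        self.temporal_fc = nn.Linear(hid_dim, hid_dim)
        self.classifier = nn.Linear(2 * hid_dim, num_classes)

    def reason(self, x, fc):
        # x: (..., N, D); relation graph from dot-product similarity.
        rel = torch.softmax(x @ x.transpose(-1, -2), dim=-1)   # (..., N, N)
        return F.relu(fc(rel @ x))                              # message passing

    def forward(self, joints):
        # joints: (T, J, C) joint coordinates over T frames.
        x = F.relu(self.embed(joints))                           # (T, J, D)
        spatial = self.reason(x, self.spatial_fc)                # joint-to-joint, per frame
        temporal = self.reason(x.transpose(0, 1), self.temporal_fc)  # (J, T, D): each joint over time
        # Pool both streams and fuse for action classification.
        fused = torch.cat([spatial.mean(dim=(0, 1)),
                           temporal.mean(dim=(0, 1))], dim=-1)
        return self.classifier(fused)

logits = JointSpatialTemporalReasoning()(torch.randn(32, 25, 3))  # 32 frames, 25 joints
print(logits.shape)  # torch.Size([60])
```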
To account for the emotional differences between regions of a video frame and to exploit the interrelationships between regions, a region dual attention-based video emotion recognition method (RDAM) is proposed. RDAM takes video frame sequences as input and learns a discriminative video emotion representation that makes full use of the emotional differences between regions and of their interrelationships. Specifically, we construct two parallel attention modules. The regional location attention module generates a weight for each feature region that reflects its relative importance; based on these weights, an emotion feature that attends to emotionally sensitive regions is produced. The regional relationship attention module generates a region relation matrix that encodes the interrelationships between regions of a video frame; based on this matrix, an emotion feature that perceives the relationships between regions is produced. The outputs of the two attention modules are fused into the emotional feature of the video frame. The features of the frame sequence are then fused by an attention-based fusion network to produce the final emotion feature of the video. Experimental results on video emotion recognition datasets show that the proposed method outperforms related works.
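The following is a minimal sketch of the two parallel attention modules described above, operating on a grid of region features from a single frame (e.g. a flattened 7x7 feature map). The module internals and dimensions are illustrative assumptions.

```python
# Sketch: regional location attention (per-region weights) and regional
# relationship attention (region relation matrix), fused into one frame-level
# emotion feature. Dimensions and scoring functions are assumptions.
import torch
import torch.nn as nn

class RegionDualAttention(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.loc_score = nn.Linear(dim, 1)      # regional location attention
        self.query = nn.Linear(dim, dim)        # regional relationship attention
        self.key = nn.Linear(dim, dim)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, regions):
        # regions: (N, D) features of N spatial regions of one frame.
        # Location attention: one weight per region, weighted sum of features.
        w = torch.softmax(self.loc_score(regions), dim=0)              # (N, 1)
        loc_feat = (w * regions).sum(dim=0)                            # (D,)
        # Relationship attention: N x N region relation matrix.
        rel = torch.softmax(self.query(regions) @ self.key(regions).t()
                            / regions.size(-1) ** 0.5, dim=-1)         # (N, N)
        rel_feat = (rel @ regions).mean(dim=0)                         # (D,)
        # Fuse both attention outputs into the frame-level emotion feature.
        return self.fuse(torch.cat([loc_feat, rel_feat], dim=-1))

frame_feature = RegionDualAttention()(torch.randn(49, 512))  # 7x7 regions
print(frame_feature.shape)  # torch.Size([512])
```

Frame-level features produced this way would then be aggregated over the sequence by the attention-based fusion network mentioned in the abstract.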