Proceedings of the 27th ACM International Conference on Multimedia 2019
DOI: 10.1145/3343031.3351040
Explainable Video Action Reasoning via Prior Knowledge and State Transitions

Abstract: Human action analysis and understanding in videos is an important and challenging task. Although substantial progress has been made in past years, the explainability of existing methods is still limited. In this work, we propose a novel action reasoning framework that uses prior knowledge to explain semantic-level observations of video state changes. Our method takes advantage of both classical reasoning and modern deep learning approaches. Specifically, prior knowledge is defined as the information of a targe…
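The abstract is cut off above, but its framing, explaining actions via prior knowledge applied to observed state changes, can be illustrated with a deliberately simplified sketch. Everything below (the rule table, the state names, and the matching loop) is an assumption for illustration only, not the authors' actual pipeline, which combines learned components with classical reasoning.

```python
# Toy sketch of the general idea only: explain an action by matching an
# observed object-state transition against hand-written prior-knowledge rules.
# The rules and state names are illustrative assumptions, not the paper's.

# Prior knowledge: action -> (object, state before, state after)
RULES = {
    "open door":   ("door", "closed", "open"),
    "close door":  ("door", "open", "closed"),
    "pick up cup": ("cup", "on_table", "in_hand"),
}

def explain(observations):
    """observations: list of (object, state) pairs, one per time step."""
    explanations = []
    for (obj_a, s_a), (obj_b, s_b) in zip(observations, observations[1:]):
        if obj_a != obj_b or s_a == s_b:
            continue  # no state change for the same object
        for action, (obj, before, after) in RULES.items():
            if (obj, before, after) == (obj_a, s_a, s_b):
                explanations.append(
                    f"'{action}' because the {obj} changed from {before} to {after}"
                )
    return explanations

print(explain([("door", "closed"), ("door", "open")]))
# ["'open door' because the door changed from closed to open"]
```

In the actual framework the per-frame (object, state) observations would presumably come from learned detectors rather than being given, but the rule-matching step conveys why the resulting explanations are human-readable.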

Cited by 48 publications (17 citation statements); references 79 publications; citing publications span 2020–2024. The citation statements below are ordered by relevance.
“…Inspired by the convincing performance and high interpretability of graph convolutional networks (GCN) [18,46,60], several works [53,54,63,15,59] were proposed to represent the fine-grained temporal relationships within videos by using GCN in such a supervised learning fashion with large amounts of labeled data. Unfortunately, due to the lack of principles or guidelines to explore the intrinsic temporal knowledge of unlabeled video data, it is quite challenging to utilize GCN for self-supervised video representation learning.…”
Section: Introduction (mentioning)
confidence: 99%
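As a point of reference for the graph-convolution machinery mentioned in the excerpt above, here is a minimal single GCN layer (Kipf-Welling style normalization) applied to per-frame features over a temporal adjacency matrix. It is a generic sketch with assumed shapes and a toy chain-structured graph, not the architecture of any of the works cited in [53,54,63,15,59].

```python
# Generic GCN layer over per-frame features -- an illustrative assumption,
# not the architecture of any specific cited work.
import numpy as np

def gcn_layer(X, A, W):
    """X: (T, d) frame features, A: (T, T) temporal adjacency, W: (d, d_out) weights."""
    A_hat = A + np.eye(A.shape[0])                               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 A D^-1/2
    return np.maximum(A_norm @ X @ W, 0.0)                       # ReLU(A_norm X W)

# Toy usage: 8 frames, 16-dim features, edges between temporally adjacent frames.
T, d = 8, 16
X = np.random.randn(T, d)
A = np.eye(T, k=1) + np.eye(T, k=-1)
W = 0.1 * np.random.randn(d, d)
print(gcn_layer(X, A, W).shape)  # (8, 16)
```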
“…Understanding these movements is vital for an artificial intelligent agent to comprehend and interact with the ever-changing world. Studies on social behavior analysis [8,9], action recognition [63,58], and video summarizing [59] have also acknowledged the importance of movement.…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides, the logical rules of different RPM problems are often different, which makes this task more challenging. With the success of deep learning in image [7], [8], [9] and videos [10], [11], [12], [13], [14], recent approaches [3], [4], [15], [16], [17], [18] try to solve RPM problems with deep neural networks. Similar to the image and video classification, these methods also address abstract reasoning on RPM problems as a classification task.…”
Section: Introduction (mentioning)
confidence: 99%
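The excerpt above notes that recent methods frame RPM answer selection as a classification task. The sketch below illustrates that framing in the most generic way possible: a stand-in scorer rates each of the eight candidate panels and the highest-scoring one is chosen. The scorer, feature sizes, and parameters are all illustrative assumptions, not any cited model.

```python
# Sketch of RPM answer selection as 8-way classification -- a generic
# illustration of the framing described above, not any specific cited model.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def score(context, candidate, w):
    """Stand-in for a learned scorer: a simple linear score on concatenated features."""
    return float(w @ np.concatenate([context, candidate]))

rng = np.random.default_rng(0)
context = rng.standard_normal(32)             # features of the 3x3 context panels
candidates = rng.standard_normal((8, 32))     # features of the 8 answer choices
w = rng.standard_normal(64)                   # toy "learned" parameters

logits = np.array([score(context, c, w) for c in candidates])
probs = softmax(logits)                        # train with cross-entropy in practice
print("predicted answer:", int(np.argmax(probs)))
```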