2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)
DOI: 10.1109/iccvw.2019.00288

Spatio-Temporal Action Graph Networks

Abstract: Events defined by the interaction of objects in a scene are often of critical importance; yet important events may have insufficient labeled examples to train a conventional deep model to generalize to future object appearance. Activity recognition models that represent object interactions explicitly have the potential to learn in a more efficient manner than those that represent scenes with global descriptors. We propose a novel inter-object graph representation for activity recognition based on a disentangle…
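
As an illustration of the kind of model the abstract describes (explicit per-frame object interaction graphs aggregated over time), here is a minimal PyTorch sketch. The layer choices, dimensions, attention form, and readout are assumptions made for exposition only, not the authors' implementation.

```python
# Illustrative sketch only: a minimal spatio-temporal object-graph classifier,
# loosely following the idea in the abstract (explicit inter-object relations
# per frame, aggregated over time). All design choices here are assumptions.
import torch
import torch.nn as nn


class SpatioTemporalActionGraph(nn.Module):
    def __init__(self, obj_dim=256, hidden=128, num_classes=10):
        super().__init__()
        # Edge MLP: scores how strongly object i attends to object j in a frame.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * obj_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        self.node_proj = nn.Linear(obj_dim, hidden)
        # Temporal aggregation over per-frame graph summaries.
        self.temporal = nn.GRU(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, obj_feats):
        # obj_feats: (B, T, N, D) object features per frame (e.g. RoI-pooled).
        B, T, N, D = obj_feats.shape
        x = obj_feats.reshape(B * T, N, D)
        # Pairwise concatenation -> soft adjacency (B*T, N, N).
        pairs = torch.cat(
            [x.unsqueeze(2).expand(-1, N, N, -1),
             x.unsqueeze(1).expand(-1, N, N, -1)], dim=-1)
        adj = self.edge_mlp(pairs).squeeze(-1).softmax(dim=-1)
        # Message passing: each node receives a weighted sum of its neighbours.
        nodes = adj @ self.node_proj(x)                    # (B*T, N, hidden)
        frame_repr = nodes.mean(dim=1).reshape(B, T, -1)   # graph readout per frame
        _, h = self.temporal(frame_repr)                   # aggregate over time
        return self.classifier(h[-1])                      # activity logits


# Usage: 2 clips, 8 frames, 5 detected objects with 256-d appearance features.
logits = SpatioTemporalActionGraph()(torch.randn(2, 8, 5, 256))
print(logits.shape)  # torch.Size([2, 10])
```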

Cited by 71 publications (47 citation statements). References 51 publications.
“…Attention for action recognition: There has been a large body of work on incorporating attention in neural networks, primarily focused on language-related tasks [44,51]. Attention for videos has been pursued in various forms, including gating or second-order pooling [12,30,31,49], guided by human pose or other primitives [4,5,12,13], region-graph representations [19,48], recurrent models [37] and self-attention [47]. Our model can be thought of as a form of self-attention complementary to these approaches.…”
Section: Related Work
confidence: 99%
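
The excerpt above frames the cited model as a form of self-attention over video features. Below is a minimal sketch of generic scaled dot-product self-attention applied to a set of region (or frame) features, purely to illustrate the mechanism; it is not the architecture of any specific cited work.

```python
# Generic self-attention over region features; illustrative only.
import torch
import torch.nn as nn


class RegionSelfAttention(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, regions):
        # regions: (B, R, dim) feature vectors for R regions (or frames).
        q, k, v = self.q(regions), self.k(regions), self.v(regions)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        return attn @ v + regions  # attended features with a residual connection


out = RegionSelfAttention()(torch.randn(2, 16, 256))
print(out.shape)  # torch.Size([2, 16, 256])
```
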
“…Other approaches use physiological signals from the driver [22]. In recent years, deep-learning computer vision techniques have been applied to anomaly detection in first-person driving videos [14,15,23,24]. Other works further attempt to classify the type of anomaly occurring in the video, either offline after the video is fully observed [25][26][27] or in real time [16].…”
Section: Traffic Video Anomaly Detection and Classification
confidence: 99%
“…STAG [24] (anomaly detection, supervised): uses a spatio-temporal action graph (STAG) network to model the spatial and temporal relations among objects.…”
Section: DSA-RNN [23]
confidence: 99%
“…The core idea is to enable communication between image regions to build contextualized representations of these regions. Graph networks have been successfully applied to various tasks, from object detection [25] and region classification [7] to human-object interaction [30] and activity recognition [12]. Besides, self-attention models [35] and non-local networks [38] can also be cast as graph networks in a general sense.…”
Section: Graph Network and Contextualized Representations
confidence: 99%
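
The last excerpt describes graph networks as letting image regions communicate to build contextualized representations. A hedged sketch of one such GCN-style message-passing step over region features follows; the adjacency construction and the single-layer update are illustrative assumptions, not taken from any cited paper.

```python
# One graph-convolution step over image-region features; illustrative only.
import torch


def region_gcn_step(feats, adj, weight):
    """feats: (R, D) region features; adj: (R, R) binary adjacency; weight: (D, D)."""
    adj = adj + torch.eye(adj.size(0))             # add self-loops
    deg_inv = adj.sum(dim=1, keepdim=True).reciprocal()
    messages = deg_inv * (adj @ feats)             # mean over each region's neighbours
    return torch.relu(messages @ weight)           # contextualized region features


R, D = 6, 128
feats = torch.randn(R, D)
adj = (torch.rand(R, R) > 0.5).float()             # stand-in for e.g. IoU-based edges
adj = ((adj + adj.t()) > 0).float()                # make the graph symmetric
out = region_gcn_step(feats, adj, torch.randn(D, D) * 0.05)
print(out.shape)  # torch.Size([6, 128])
```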