2021 IEEE Winter Conference on Applications of Computer Vision (WACV) 2021
DOI: 10.1109/wacv48630.2021.00233

Exploration of Spatial and Temporal Modeling Alternatives for HOI

Citations: cited by 4 publications (3 citation statements)
References: 37 publications
“…Qi et al [38] expand prior graphical models in DNNs for videos with learnable graph structures and pass messages through GPNN. Dabral et al [6] analyze the effectiveness of GCNs against Convolutional Networks and Capsule Networks for spatial relation learning. Wang et al [53] propose the STIGPN exploiting the parsed graphs to learn spatiotemporal connection development and discover objects existing in a scene.…”
Section: HOI Recognition in Videos
Citation type: mentioning
confidence: 99%
“…Qi et al [38] expand prior graphical models in DNNs for videos with learnable graph structures and pass messages through GPNN. Dabral et al [6] analyse the effectiveness of GCNs against Convolutional Networks and Capsule Networks for spatial relation learning. Wang et al [53] propose the STIGPN exploiting the parsed graphs to learn spatiotemporal connection development and discover objects existing in a scene.…”
Section: HOI Recognition in Videos
Citation type: mentioning
confidence: 99%
“…On the other hand, the model needs to consider human dynamics in the video and the shifting orientations of items in the scene in relation to humans [38]. This makes it difficult to directly extend image-based models to video that exploit the region of interest (ROI) features of human-object union [6]. We propose a novel two-level graph to refine the interactive representations; the first graph models the interdependency within the geometric key points of human and objects, and the second graph models the interdependency between the visual features and the learned geometric representations.…”
Section: Geometric Features Informed HOI Analysis
Citation type: mentioning
confidence: 99%