2022
DOI: 10.48550/arxiv.2202.09277
Preprint

(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering

Abstract: Spatio-temporal scene-graph approaches to video-based reasoning tasks such as video question-answering (QA) typically construct such graphs for every video frame. Such approaches often ignore the fact that videos are essentially sequences of 2D "views" of events happening in a 3D space, and that the semantics of the 3D scene can thus be carried over from frame to frame. Leveraging this insight, we propose a (2.5+1)D scene graph representation to better capture the spatio-temporal information flows inside the v…

Citations: cited by 1 publication (1 citation statement)
References: 41 publications
“…3, we present qualitative QA results and compare against the responses produced by two recent methods. More results and visualizations are provided in the extended paper (Cherian et al 2022).…”
Section: Ablation Studies; citation type: mentioning
confidence: 99%