GIMO: Gaze-Informed Human Motion Prediction in Context
2022
DOI: 10.1007/978-3-031-19778-9_39

Cited by 32 publications (9 citation statements)
References 53 publications
“…At the same time, the work presented in Li et al (2023) achieves the best performance on a set of egocentric datasets (Luo et al, 2021; Zheng et al, 2022) captured from an outward-looking camera perspective, including their proposed synthetic egocentric dataset. Given that directly matching egocentric video with full-body pose is challenging due to the frequent absence of visible body parts, the authors address the task by introducing an intermediate step of head motion estimation.…”
Section: State-of-the-art Papers
Mentioning confidence: 98%
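The two-stage pipeline that excerpt describes (egocentric video features → head motion → full-body pose) can be sketched as below. This is a minimal illustration of the idea, not the cited paper's architecture: the module types, layer sizes, and output parameterizations (6-DoF head motion, 72-D body pose) are assumptions.

```python
import torch
import torch.nn as nn

class TwoStageEgoPose(nn.Module):
    """Hypothetical two-stage pipeline: egocentric video features are first
    mapped to an intermediate head-motion trajectory, and full-body pose is
    decoded from that trajectory. All sizes are illustrative assumptions."""
    def __init__(self, feat_dim=512, head_dim=6, pose_dim=72):
        super().__init__()
        # Stage 1: regress per-frame head motion (e.g. 6-DoF pose deltas)
        self.head_rnn = nn.GRU(feat_dim, 256, batch_first=True)
        self.head_out = nn.Linear(256, head_dim)
        # Stage 2: decode full-body pose from the head-motion trajectory
        self.body_rnn = nn.GRU(head_dim, 256, batch_first=True)
        self.pose_out = nn.Linear(256, pose_dim)

    def forward(self, video_feats):                # (B, T, feat_dim)
        h, _ = self.head_rnn(video_feats)
        head_motion = self.head_out(h)             # (B, T, head_dim)
        b, _ = self.body_rnn(head_motion)
        body_pose = self.pose_out(b)               # (B, T, pose_dim)
        return head_motion, body_pose
```

The point of the intermediate stage, per the excerpt's reasoning, is that head motion is observable from the egocentric view even when the rest of the body is not, so it can be supervised directly before (or alongside) the full-body pose loss.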
“…We use the GIMO dataset [Zheng et al 2022] as another test dataset for evaluating the generalization ability of the proposed method on out-of-distribution data.…”
Section: Datasets
Mentioning confidence: 99%
“…For each sequence, we compute the mean of the non-collision scores for all the objects in the scene. In Table 2, we compare the mean non-collision scores on the smoothed PROXD dataset [Zhang et al 2021], which was used during training, and the unseen GIMO dataset [Zheng et al 2022], which also provides SMPL-X parameters for humans interacting with scenes.…”
Section: Contact Object Recovery
Mentioning confidence: 99%
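A common way to compute such a non-collision score is the fraction of body vertices with non-negative signed distance to each object, averaged first over the objects in the scene and then over the frames of the sequence. The sketch below follows that convention; the cited paper's exact variant may differ, and the function names and array layouts are assumptions.

```python
import numpy as np

def non_collision_score(sdf_values):
    """Fraction of body vertices that do not penetrate one object.
    sdf_values: (V,) array of the object's signed distance function sampled
    at the body vertices; >= 0 means the vertex is outside the object."""
    return float(np.mean(sdf_values >= 0))

def sequence_mean_score(per_frame_object_sdfs):
    """per_frame_object_sdfs: list over frames, each a list of (V,) arrays,
    one per scene object. Averages the per-object scores over all objects
    in the scene, then over the frames of the sequence, matching the
    'mean of the non-collision scores for all the objects' in the excerpt."""
    frame_means = [
        np.mean([non_collision_score(s) for s in objects])
        for objects in per_frame_object_sdfs
    ]
    return float(np.mean(frame_means))
```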
“…Then, the output viewport embedding f_{m-s} is expected to be aware of the 3D video, which results in the final viewport embedding f_{m-g}. Inspired by [35], we handle the gaze embedding in a bidirectional manner, i.e., the viewport embedding f_m is also utilized as the query to update the gaze features into f_{g-m}. The bidirectionally fused multi-modal features are then assembled into holistic temporal input representations to perform human viewport prediction.…”
Section: Design
Mentioning confidence: 99%
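The bidirectional fusion described in that excerpt can be sketched as two cross-attention passes, one per direction, whose outputs are assembled into a fused representation. This is a minimal sketch under assumptions: the embedding dimension, head count, and exact query/key assignments are my reading of the excerpt, not the paper's verified design.

```python
import torch
import torch.nn as nn

class BidirectionalGazeFusion(nn.Module):
    """Sketch of bidirectional gaze/viewport fusion: each modality attends
    to the other via cross-attention, yielding a gaze-informed viewport
    embedding (f_{m-g}) and a viewport-informed gaze embedding (f_{g-m}).
    Names follow the excerpt; dimensions are assumptions."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.viewport_attends_gaze = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gaze_attends_viewport = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f_m, f_g):                   # both (B, T, dim)
        # Viewport embedding queries the gaze features -> f_{m-g}
        f_m_g, _ = self.viewport_attends_gaze(f_m, f_g, f_g)
        # Gaze features queried against the viewport embedding -> f_{g-m}
        f_g_m, _ = self.gaze_attends_viewport(f_g, f_m, f_m)
        # Assemble both directions into one multi-modal representation
        return torch.cat([f_m_g, f_g_m], dim=-1)   # (B, T, 2*dim)
```

Fusing in both directions lets gaze refine the viewport representation while the viewport context simultaneously disambiguates noisy gaze signals, which is the rationale the excerpt gives for handling the gaze embedding bidirectionally.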