2021
DOI: 10.1109/tcsvt.2020.3004453

Graph-Based CNNs With Self-Supervised Module for 3D Hand Pose Estimation From Monocular RGB

Cited by 32 publications (17 citation statements)
References 32 publications
“…Features learned from these tasks can be transferred to image tasks with good performance. Guo et al designed a self-correction module to co-train networks in previous stages for hand pose estimation [43]. Xu et al proposed a set of pretext tasks specifically designed for sketches [44].…”
Section: A Intra-sample Learning (mentioning)
confidence: 99%
“…Compared to it, our method is more suitable for video tasks. Guo et al [32] use the 2D heat map as the intermediate supervised signal for 3D hand pose estimation. In comparison with this, our method fuses three self-supervised signals and makes avoiding collapse solution into consideration.…”
Section: A Learning From Video Content (mentioning)
confidence: 99%
“…[25] exploits spatial and temporal relationships for 3D human and hand pose estimation tasks. [24] introduces a self-supervised module that uses 2D relationships and 3D geometric knowledge to reduce the gap between 2D and 3D spaces. [27] proposes a UNet-based GCNs to estimate the 3D hand pose and the 6D object pose.…”
Section: Related Work (mentioning)
confidence: 99%
“…Inspired by the natural graph representation of the hand, recent studies use graph convolution networks (GCN) [22] to model skeletal constraints between joints [23][24][25][26][27]. [25] exploits annotated video frames to enforce spatial and temporal relationships between the joints based on predefined semantic meanings.…”
Section: Introduction (mentioning)
confidence: 99%