2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00083

Distill Knowledge From NRSfM for Weakly Supervised 3D Pose Learning

Abstract: We propose to learn a 3D pose estimator by distilling knowledge from Non-Rigid Structure from Motion (NRSfM). Our method uses solely 2D landmark annotations. No 3D data, multi-view/temporal footage, or object-specific prior is required. This alleviates the data bottleneck, which is one of the major concerns for supervised methods. The challenge of using NRSfM methods as a teacher is that they often produce poor depth reconstructions when the 2D projections have strong ambiguity. Directly using those wrong depth as hard targ…

Cited by 50 publications (29 citation statements)
References 48 publications
“…(1) extending its annotation modalities, (2) using weakly-supervised learning [34,38,49] to estimate other modalities, (3) using transfer learning and domain adaptation [1,5,22] to transfer knowledge of other modalities from other data domain to our benchmark.…”
Section: Does Depth Information Help? (mentioning)
confidence: 99%
“…Knowledge distillation methods have been widely used in many vision tasks, including object detection [30,6,13], line detection [20], semantic segmentation [62,18,34] and human pose estimation [66,40,56,58]. DOPE [58] proposes to distill the 2D and 3D poses from three independent body part expert models to the single whole-body pose detection model.…”
Section: Related Work (mentioning)
confidence: 99%
“…Therefore, generalization to in-the-wild applications remains challenging. Weakly-supervised methods were proposed to address this problem using unpaired 2D and 3D annotations [38,40,20], limited available [Figure 1. Sample predictions of the proposed weakly-supervised method for in-the-wild videos from 3DPW dataset.]…”
Section: Introduction (mentioning)
confidence: 99%
“…However, obtaining such information for the unsupervised learning task is still an obstacle. To the best of our knowledge, there are only a few works that propose weakly-supervised training schemes without using any 3D annotation [18,13,39,40]. [13] and [39] propose multi-view consistency as a supervision while [18] generates pseudo ground-truth 3D poses using epipolar geometry.…”
Section: Introduction (mentioning)
confidence: 99%