2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.01101

Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

Abstract: Monocular 3D human pose and shape estimation is an inherently ill-posed problem due to depth ambiguities, occlusions, and truncations. Recent probabilistic approaches learn a distribution over plausible 3D human meshes by maximizing the likelihood of the ground-truth pose given an image. We show that this objective function alone is not sufficient to best capture the full distributions. Instead, we propose to additionally supervise the learned distributions by minimizing the distance to distributions encoded i…

Citations: cited by 92 publications (17 citation statements)
References: 32 publications
“…Sharma et al [55] propose to solve the ill-posed 2D-to-3D lifting problem with CVAE [59] and Ordinal Ranking. Wehrbein et al [67] use Normalizing Flows to model the deterministic 3D-2D projection and solve the ambiguous inverse 2D-3D lifting problem. The major difference between our work and multiple-hypotheses 3D pose estimation is that our model is trained without 3D ground truth.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, generating multi-hypothesis reconstruction for NRSfM is challenging for several reasons: (1) Without 3D ground-truth as supervision, the ambiguous 2D-to-3D mappings cannot be learned using standard generative models like CVAE [59], Conditional GAN [37] or Normalizing Flow [67]. (2) Multiple hypotheses easily suffer from the decomposition ambiguity of NRSfM [18], i.e.…”
Section: Multiple Hypothesis Reconstruction Network - Overview (mentioning)
confidence: 99%
“…Recent state-of-the-art methods such as MATAL [14] and UncertainGCN [9] use powerful techniques such as reinforcement learning and graph convolutional networks, however they are computationally very expensive. Uncertainty in 3D human pose [4,44,56] utilizes depth information or stereo images, both of which are not available for general 2D pose estimation. We reserve our discussion on uncertainty in 2D for later.…”
Section: Related Work (mentioning)
confidence: 99%
“…A normalizing flow (NF) [28] is a generative model that transforms data into tractable distributions. Unlike conventional neural networks, their mapping is bijective, which allows them to train and evaluate in both directions [39]. The forward pass projects data into a latent space to calculate exact likelihoods for the data given the predefined latent distribution.…”
Section: Normalizing Flows (mentioning)
confidence: 99%
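
The bijective mapping and exact likelihood described in the statement above can be illustrated with a minimal sketch. It assumes a single elementwise affine transform and a standard normal latent distribution; the class name `AffineFlow` and its parameters `shift` and `log_scale` are hypothetical and are not taken from the cited papers, whose flows are learned and considerably more expressive.

```python
import math


class AffineFlow:
    """Toy bijection z = (x - shift) * exp(-log_scale), applied per dimension.

    Illustrative assumption only: real normalizing flows stack many learned
    invertible layers; this keeps just one fixed affine layer.
    """

    def __init__(self, shift, log_scale):
        self.shift = list(shift)
        self.log_scale = list(log_scale)

    def forward(self, x):
        # Data -> latent space, plus log|det(dz/dx)| of the transform.
        z = [(xi - b) * math.exp(-s)
             for xi, b, s in zip(x, self.shift, self.log_scale)]
        log_det = -sum(self.log_scale)
        return z, log_det

    def inverse(self, z):
        # Latent -> data space; exact because the mapping is bijective.
        return [zi * math.exp(s) + b
                for zi, b, s in zip(z, self.shift, self.log_scale)]

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, I) + log|det(dz/dx)|.
        z, log_det = self.forward(x)
        log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi)
                       for zi in z)
        return log_base + log_det


flow = AffineFlow(shift=[1.0, -2.0], log_scale=[0.5, -0.3])
x = [0.2, 0.4]
z, log_det = flow.forward(x)   # forward pass into the latent space
x_rec = flow.inverse(z)        # inverse pass recovers the input
log_px = flow.log_prob(x)      # exact log-likelihood of x
```

Sampling would run the same object in the other direction: draw `z` from the standard normal and call `flow.inverse(z)`.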