2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.01023

Camera Distance-Aware Top-Down Approach for 3D Multi-Person Pose Estimation From a Single RGB Image

Figure 1: Qualitative results of applying our 3D multi-person pose estimation framework to the COCO dataset [25], which consists of in-the-wild images. Most of the previous 3D human pose estimation studies mainly focused on the root-relative 3D single-person pose estimation. In this study, we propose a general 3D multi-person pose estimation framework that takes into account all factors including human detection and 3D human root localization.

Abstract: Although significant improvement has been achieved recently in …

citations
Cited by 325 publications
(390 citation statements)
references
References 44 publications
(144 reference statements)
“…Table 1 summarizes the quantitative results for pose estimation using the position error metric. RGB only [ 34 ] is the state-of-the-art of 3D human pose estimation using only a single RGB camera. estimates the human pose by minimizing the cost function composed of , , and .…”
Section: Discussion (mentioning)
confidence: 99%
“…In [ 81 ], an insight has been provided into how the body motion and action features from views of two cameras are correlated. A similar study [ 82 ] described the 3D pose map. We will refer to their method to extend this study using multiple cameras.…”
Section: Methods (mentioning)
confidence: 99%
“…The network generates human pose proposals, and then classifies the generated human poses into several anchor poses, and refines the poses through regression. Moon et al [ 10 ] proposed a camera-distance-aware top-down method. Their network consists of PoseNet, which predicts root-relative 3D poses, and RootNet, which predicts the absolute 3D pose of the root joint.…”
Section: Related Work (mentioning)
confidence: 99%
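
The statement above describes a two-part design: one network predicts a root-relative 3D pose and another predicts the absolute 3D position of the root joint. A minimal sketch of how two such outputs could be combined into a pose in camera coordinates is given below. It assumes both outputs are already expressed in the same metric camera-centred frame, so that combining them is a plain translation; the function name `to_camera_coordinates` and the sample values are hypothetical, and this is not the authors' implementation.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def to_camera_coordinates(
    root_relative_pose: List[Point3D],
    root_position: Point3D,
) -> List[Point3D]:
    """Translate a root-relative pose by the absolute root position.

    Assumption: both inputs use the same units (e.g. millimetres) and the
    same camera-centred axes, so each joint only needs to be shifted by
    the root position.
    """
    rx, ry, rz = root_position
    return [(x + rx, y + ry, z + rz) for (x, y, z) in root_relative_pose]


# Hypothetical values: a root joint (at the origin of the root-relative
# frame) and one other joint, plus a root located 3 m from the camera.
relative_pose = [(0.0, 0.0, 0.0), (0.0, -250.0, 10.0)]
root = (100.0, -50.0, 3000.0)
absolute_pose = to_camera_coordinates(relative_pose, root)
```

Under these assumptions the root joint lands exactly at the predicted root position, and every other joint keeps its offset from the root.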
“…Moreover, most methods estimate a root-relative 3D pose or shape, and the acquisition of pose or shape in the camera coordinate system requires ground-truth absolute depth information. Moon et al [ 10 ] recently proposed a new method to predict the absolute depth of the root joint to solve this problem and consequently estimate the 3D poses of multiple persons in the camera coordinate system. However, the method of [ 10 ] focuses on pose estimation rather than shape reconstruction, and the entire system cannot be learned in end-to-end fashion due to the separation of pose estimation and absolute depth estimation modules.…”
Section: Introduction (mentioning)
confidence: 99%