2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.00756

Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

Cited by 45 publications (23 citation statements)
References 38 publications
“…3DCrowdNet [12] is a top-down method that proposed to concatenate image features and 2D pose heatmaps to exploit the 2D pose-guided features for better accuracy. We also involved 3D skeleton estimation approaches [7,48,59]: Moon et al [48]'s work that estimates the absolute root position and root-relative 3D skeletons focusing on camera distance. Cheng et al [7]'s work that integrates top-down method and bottom-up methods for estimating better 3D skeletons.…”
Section: Methods (mentioning)
confidence: 99%
“…Recent multi-person 3D pose regression works [7,54,59,72] tackled a variety of issues such as developing attention-based mechanism dedicated to the 3D pose estimation problem which considers 3D-to-2D projection process [72], combining the top-down and bottom-up networks [7], developing the tracking-based for multi-person [54] and so on. Sárándi et al [59] recently proposed a metric-scale 3D pose estimation method that is robust to truncations.…”
Section: Related Work (mentioning)
confidence: 99%
“…This paper is based on our conference paper [63]. Unlike our conference version, however, we add test time optimization to handle the gap between training and testing data in Section 3.5, which is critical for our method to process unseen videos.…”
Section: Task (mentioning)
confidence: 99%
“…To handle inherent depth ambiguity, Wang et al [39] proposed a novel hierarchical multi-person ordinal relation. Another type of work utilizes temporal information to recover 3D poses from a given video [7,8]. By applying a top-down scheme, the above method either directly regresses the absolute 3D depth from a cropped image, or it computes it based on a prior of the body size, ignoring global image contexts.…”
Section: Related Work (mentioning)
confidence: 99%