2021
DOI: 10.48550/arxiv.2111.08557
Preprint

Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation

Abstract: In keypoint estimation tasks such as human pose estimation, heatmap-based regression is the dominant approach despite possessing notable drawbacks: heatmaps intrinsically suffer from quantization error and require excessive computation to generate and post-process. Motivated to find a more efficient solution, we propose a new heatmap-free keypoint estimation method in which individual keypoints and sets of spatially related keypoints (i.e., poses) are modeled as objects within a dense single-stage anchor-based d…

Citations: cited by 4 publications (3 citation statements)
References: 43 publications
“…KAPAO is an efficient single-stage multi-person human pose estimation method that models keypoints and poses as objects within a dense anchor-based detection framework. KAPAO simultaneously detects pose objects and keypoint objects and fuses the detections to predict human poses [28].…”
Section: Kapao-based Pose Estimation
Citation type: mentioning
confidence: 99%
“…At present, a lot of excellent works [2, 9–15] based on the Transformer [8] or its variants [16–20] have emerged. There are also many excellent works [21–26] based on Transformer in the field of human posture estimation. PRTR [21] utilizes the encoder-decoder structure of transformers to perform a regression-based person and keypoint detection.…”
Section: Vision Transformer For Human Pose Estimation
Citation type: mentioning
confidence: 99%
“…We also introduce a variant, BiFenceNet, that outperforms JLJA while using the same 2D skeleton data. This way, coaches and analysts could extract information directly from videos, by training FenceNet on 2D pose data extracted from an off-the-shelf 2D pose estimator [6,32,51], as seen in Fig. 1.…”
Section: Introduction
Citation type: mentioning
confidence: 99%