2018
DOI: 10.1007/978-3-030-01234-2_8

Propagating LSTM: 3D Pose Estimation Based on Joint Interdependency

Citations: cited by 195 publications (129 citation statements)
References: 45 publications
“…Structure-aware network architectures have also been used in 3D pose estimation from images [16,29,21,17,31]. [17] and [31] both learn a structured latent space.…”
Section: Related Work (mentioning)
confidence: 99%
“…[21] exploit structure only implicitly by encoding the poses into distance matrices which then serve as inputs and outputs of the network. [16] and [29] are closest to our work as they explicitly modify the network to account for skeletal structure, either via the loss function [29], or using a sequence of LSTM cells for each joint in the skeleton [16]. [16] introduces many new layers into the architecture and needs hyper-parameter tuning to be most effective.…”
Section: Related Work (mentioning)
confidence: 99%
“…Another widely used strategy is to divide the 3D pose estimation task into two decoupled subtasks: 2D pose detection, followed by 3D pose inference from 2D poses. These methods comprise a 2D pose detector and a subsequent optimization [42,41,43] or regression [4,3,16,27,33,13,18,7,11] step to estimate 3D pose. In these methods, the 2D pose and 3D pose estimation stages are separated, making these 3D pose estimators generalize well on outdoor images.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, the well-known 3d pose datasets [17], [18] contain 3d motion capture (MoCap) data recorded in controlled setup in indoor settings. Hence, 3d supervised learning methods [4], [7], [9], [19] do not generalize well to datasets in the wild where 3d ground-truth is not present.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is a severely ill-posed problem which has been formulated in recent literature [3]-[13], [20] as a supervised learning problem given the availability of 3d human motion capture datasets [17], [18]. Most of these works focus on end-to-end 3d pose estimation from single images [3], [5]-[9], [12], [13], [20], while some utilize temporal sequences for estimating 3d pose from video [10], [11]. Our work is related to the problem of 3d pose estimation from a single image, which can be applicable for videos, but without utilizing any temporal information.…”
Section: Introduction (mentioning)
confidence: 99%