2020
DOI: 10.1109/access.2020.3013917

An Adaptive Viewpoint Transformation Network for 3D Human Pose Estimation

Abstract: Human pose estimation from a monocular image has attracted considerable interest because of its potential applications in many areas. The performance of 2D human pose estimation has improved substantially with the emergence of deep convolutional neural networks. In contrast, recovering a 3D human pose from a 2D pose remains a challenging problem. Currently, most methods try to learn a universal mapping that can be applied to all human poses from any viewpoint. However, due to the large variety of human poses …

Citations: cited by 4 publications (2 citation statements)
References: 30 publications
“…Most conversational corpora consist of TV interviews and theatrical plays that have shown themselves to be an appropriate resource of spontaneous conversational expressions, and are significantly more suitable for research in wider ‘discourse concepts’ than any artificially recorded material [ 37 ]. Most methods related to 3D Pose and Shape Estimation from monocular sources refer to (Deep) Convolutional Neural Networks ((D) CNNs) and leverage 2D joint tracking and predict 3D joint poses in the form of stick figures [ 38 , 39 , 40 , 41 , 42 ]. The major challenge with deep learning and similar probabilistic approaches is that the tracking process involves predicting the most probable configuration of the artificial skeleton.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is widely considered a fundamental problem in computer vision due to its many downstream applications, including action recognition [6]- [11] and human tracking [12]- [14]. In particular, it is a precursor to 3D human pose estimation [15]- [17], which serves as a potential alternative to invasive marker-based motion capture.…”
Section: Introduction (mentioning)
confidence: 99%