2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.167

3D Action Recognition from Novel Viewpoints

Abstract: We propose a human pose representation model that transfers human poses acquired from different unknown views to a view-invariant high-level space. The model is a deep convolutional neural network and requires a large corpus of multiview training data which is very expensive to acquire. Therefore, we propose a method to generate this data by fitting synthetic 3D human models to real motion capture data and rendering the human poses from numerous viewpoints. While learning the CNN model, we do not use action labels […]
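The data-generation idea in the abstract (fit synthetic 3D human models to motion capture data, render each pose from many viewpoints, and train a CNN on the renderings) can be sketched roughly as follows. This is a hypothetical PyTorch illustration, not the authors' implementation: the image size, number of pose classes, network layers, and the pose-classification objective are all assumptions; the key point is that renderings of the same pose from different viewpoints share one label, which pushes the learned features toward view invariance.

```python
# Minimal sketch (not the authors' code): train a small CNN so that renderings of
# the same mocap pose seen from different viewpoints map to the same pose class;
# the penultimate layer then serves as an approximately view-invariant pose
# representation. Real depth renderings are replaced by random tensors here.
import torch
import torch.nn as nn

NUM_POSE_CLASSES = 339          # hypothetical number of representative mocap poses
IMG_SIZE = 64                   # hypothetical rendering resolution

class PoseCNN(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(128 * (IMG_SIZE // 8) ** 2, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, NUM_POSE_CLASSES)

    def forward(self, x):
        feat = self.features(x)            # view-invariant embedding (after training)
        return self.classifier(feat), feat

model = PoseCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for multi-view synthetic renderings: each pose appears under several
# viewpoints but always carries the same pose label.
views_per_pose = 4
poses = torch.randint(0, NUM_POSE_CLASSES, (8,))
images = torch.randn(8 * views_per_pose, 1, IMG_SIZE, IMG_SIZE)
labels = poses.repeat_interleave(views_per_pose)

opt.zero_grad()
logits, _ = model(images)
loss = loss_fn(logits, labels)
loss.backward()
opt.step()
print(f"one training step done, loss = {loss.item():.3f}")
```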

Cited by 174 publications (165 citation statements). References 57 publications.
“…The small variation in accuracy across different views exhibits the view-invariance property of the proposed framework, whereas in other state-of-the-art methods [18] [1] [34] the accuracy varies from view to view by as much as 10%, showing that those methods are sensitive to viewpoint. The comparison of recognition accuracy is shown in Tables IV, V, and VI for the NUCLA, UWA3DII, and NTU RGB-D datasets, respectively.…”
Section: NTU RGB-D Human Activity Dataset (mentioning)
confidence: 83%
“…In this paper, a shape temporal dynamics (STD) stream is designed to describe the long-term shape dynamics of the action with a deep convolutional neural network (CNN) whose architecture is similar to [18], except that we have connected the last layer (7) with a combination of bidirectional LSTM and LSTM layers. The architecture of our CNN follows:…”
Section: Model Architecture and Learning (mentioning)
confidence: 99%
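The quoted design (a CNN similar to [18] whose last layer feeds a combination of bidirectional LSTM and LSTM layers) can be illustrated with a rough PyTorch sketch. The per-frame encoder, layer widths, clip length, and the 60-class output below are assumptions for illustration only, not the cited architecture.

```python
# Hypothetical sketch: per-frame CNN features feeding a bidirectional LSTM followed
# by a unidirectional LSTM, then an action classifier. All sizes are placeholders.
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, num_actions=60):
        super().__init__()
        self.cnn = nn.Sequential(                      # stand-in per-frame encoder
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(32 * 16, feat_dim), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.lstm = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, clips):                          # clips: (B, T, 1, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.bilstm(feats)                    # bidirectional pass over time
        seq, _ = self.lstm(seq)                        # second, unidirectional LSTM
        return self.head(seq[:, -1])                   # classify from the last step

model = CNNBiLSTM()
logits = model(torch.randn(2, 16, 1, 64, 64))          # 2 clips of 16 frames each
print(logits.shape)                                    # torch.Size([2, 60])
```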
“…The same action viewed from different angles looks quite different. This issue was addressed in [162] using a CNN. The method generates its training data by fitting a synthetic 3D human model to real motion capture data and rendering human poses from different viewpoints.…”
Section: Discriminative/Supervised Models (mentioning)
confidence: 99%
“…Rahmani and Mian [35] transferred human poses to a view-invariant high-level space and recognized actions in depth images using a deep convolutional neural network. Their method obtained good results on multi-view datasets.…”
Section: Introduction (mentioning)
confidence: 99%
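A hedged sketch of how such a pipeline is typically used at recognition time, under the assumption of an already trained, frozen encoder: each depth frame is mapped to a view-invariant feature, the frame features are pooled over time (simple averaging here, as a stand-in for more elaborate temporal encodings), and a linear classifier predicts the action. Names and data are placeholders, not the cited method.

```python
# Hypothetical usage sketch: a frozen (already trained) CNN maps each depth frame
# to a view-invariant feature, frame features are averaged over time, and a linear
# classifier recognizes the action. All data here are random stand-ins.
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC

encoder = nn.Sequential(                    # stand-in for a trained pose CNN
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
    nn.Flatten(), nn.Linear(16 * 16, 128),
).eval()

def sequence_feature(frames):               # frames: (T, 1, H, W) depth clip
    with torch.no_grad():
        per_frame = encoder(frames)         # (T, 128) per-frame features
    return per_frame.mean(dim=0).numpy()    # simple temporal pooling

# Random stand-in data: 20 training clips of 16 frames each, 5 action classes.
clips = [torch.randn(16, 1, 64, 64) for _ in range(20)]
labels = [i % 5 for i in range(20)]
X = [sequence_feature(c) for c in clips]

clf = LinearSVC().fit(X, labels)
print(clf.predict([sequence_feature(torch.randn(16, 1, 64, 64))]))
```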