2020
DOI: 10.1007/s10044-020-00886-5

Multi-view region-adaptive multi-temporal DMM and RGB action recognition

Abstract: Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution in time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region adaptive Depth Motion Maps. Multiple views are synthesised to enhance the view invariance. The region adaptive weight…
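
For readers unfamiliar with the term, a depth motion map is commonly built by accumulating absolute differences between successive depth frames, and a multi-resolution-in-time variant can be approximated by repeating that accumulation at several frame strides. The sketch below illustrates only this generic idea and is not the paper's MV-RAMDMM formulation: the function names, the `strides` parameter, and the use of plain frame subsampling are assumptions made for illustration, and the region-adaptive weighting and multi-view synthesis mentioned in the abstract are omitted.

```python
import numpy as np


def depth_motion_map(frames, stride=1):
    """Accumulate absolute frame-to-frame depth differences.

    frames: array-like of shape (T, H, W) holding T depth frames.
    stride: keep every `stride`-th frame before differencing
            (hypothetical stand-in for a coarser time resolution).
    """
    frames = np.asarray(frames, dtype=np.float64)
    sampled = frames[::stride]
    if sampled.shape[0] < 2:
        # Fewer than two frames: no motion can be measured.
        return np.zeros(frames.shape[1:], dtype=np.float64)
    # |frame[t+1] - frame[t]| summed over time gives a motion energy image.
    return np.abs(np.diff(sampled, axis=0)).sum(axis=0)


def multi_temporal_dmms(frames, strides=(1, 2, 4)):
    """Return one motion map per temporal stride, keyed by stride."""
    return {s: depth_motion_map(frames, s) for s in strides}


if __name__ == "__main__":
    # Toy 4-frame clip with 2x2 "depth" images of constant value per frame.
    clip = np.stack([np.full((2, 2), v) for v in (0.0, 2.0, 1.0, 5.0)])
    for stride, dmm in multi_temporal_dmms(clip, strides=(1, 2, 3)).items():
        print(stride, dmm[0, 0])
```

In a pipeline like the one the abstract describes, each such map (per view and per time resolution) would feed its own network stream; how the paper weights regions and fuses streams is not recoverable from the truncated abstract.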

Cited by 11 publications (5 citation statements)
References 66 publications
“…The DMMs are extended to include multiple appearance information, similar to e.g., [31]. The important features in the 3D depth and appearance streams are then learned by a series of fine-tuned, pre-trained 3D Convolutional Neural Networks (CNN)s. More details on this action recognition technique can be found in [32].…”
Section: Methods (mentioning)
confidence: 99%
“…Some approaches try to predict pose from RGB data and use it in action prediction. In [231], a framework is provided for hierarchical region-adaptive multi-time resolution depth motion map (RAMDMM) and multi-time resolution RGB action recognition system. The suggested approach presents a feature representation method for RGB-D data that allows multi-view and multi-temporal action recognition.…”
Section: RGB and Depth (mentioning)
confidence: 99%
“…In their method feature vectors are mined from MHIs and SHIs by the GLAC feature descriptor. Al-Faris et al. [37] presented the construction of a multi-view region-adaptive multi-resolution-in-time depth motion map (MV-RAMDMM). They trained several scenes and time resolutions of the region-adaptive depth motion maps (RA-DMMs) by multi-stream 3D convolutional neural networks (CNNs).…”
Section: Related Work (mentioning)
confidence: 99%