2018
DOI: 10.1109/lsp.2018.2841649
|View full text |Cite
|
Sign up to set email alerts
|

Ensemble One-Dimensional Convolution Neural Networks for Skeleton-Based Action Recognition

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
2

Citation Types

0
27
0

Year Published

2019
2019
2024
2024

Publication Types

Select...
6
3

Relationship

1
8

Authors

Journals

citations
Cited by 77 publications
(27 citation statements)
references
References 28 publications
0
27
0
Order By: Relevance
“…The CNN-based methods transform 3D-skeleton sequence data from a vector sequence to a pseudo-image and use the CNN network to extract the pseudo-image features [ 18 , 19 , 20 , 21 , 22 , 23 ]. The advantage of the CNN method lies in its powerful feature extraction capabilities, but it cannot handle the spatial and temporal relationships in skeleton information well.…”
Section: Related Workmentioning
confidence: 99%
See 1 more Smart Citation
“…The CNN-based methods transform 3D-skeleton sequence data from a vector sequence to a pseudo-image and use the CNN network to extract the pseudo-image features [ 18 , 19 , 20 , 21 , 22 , 23 ]. The advantage of the CNN method lies in its powerful feature extraction capabilities, but it cannot handle the spatial and temporal relationships in skeleton information well.…”
Section: Related Workmentioning
confidence: 99%
“…With the continuous development of deep learning, many methods extract motion patterns to form a skeleton sequence and use RNNs to model them [ 13 , 14 , 15 , 16 , 17 ]. Other methods based on CNNs transform skeleton data into pseudo images and send them into CNNs for prediction [ 18 , 19 , 20 , 21 , 22 , 23 ]. However, these methods cannot capture the inherent spatial relationship between joints.…”
Section: Introductionmentioning
confidence: 99%
“…Based on LSTM, Song et al [14] proposed a spatio-temporal attention model, which can automatically focus on the discriminative joints and pay different attention weights to each frame. CNN based methods: Some previous works have employed CNN for skeleton based action recognition and achieved great success [15][16][17][18][19][20]. Ke et al [19] represented the sequence as three clips for each channel of the 3D coordinates, which reflects the temporal information of the skeleton sequence and spatial relationship.…”
Section: Related Workmentioning
confidence: 99%
“…Pushpajit et al [12 ] separately trained and combined 5‐CNN streams of different modalities to improve the recognition accuracy. Xu et al [14 ] proposed an Ensemble Neural Network (Ensem‐NN) to combine four different subnets based on designed one‐dimensional ConvNets with residual structure for skeleton‐based human action recognition. Ren et al [13 ] used a rank pooling mechanism to construct dynamic images and applied segment training strategy to obtain multiple ConvNets, and achieved better recognition performance by fusing learned features of different modalities.…”
Section: Related Workmentioning
confidence: 99%