2016
DOI: 10.1007/s11042-016-3630-9
Sparse coding-based space-time video representation for action recognition

Cited by 15 publications (12 citation statements)
References 22 publications
“…Weizmann dataset: Similar methodologies have been compared with the proposed methodology for the Weizmann dataset [12,16,17,25,26,31] and the results are shown in Figure 8(a). The proposed method achieved an accuracy of 97.9%.…”
Section: Comparison of the Proposed Algorithm with Different Methods on Standard Datasets
Mentioning confidence: 99%
“…This methodology does not require background subtraction, as it is based on spatio-temporal interest points. Methods reported in the literature [6][7][8][9][10][11][12][13][14] have used the popular bag-of-words model. The main disadvantage of these methodologies is that they capture only motion information and give no information about the structure.…”
Section: Introduction
Mentioning confidence: 99%
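The bag-of-words pipeline mentioned above quantizes local space-time descriptors against a learned visual vocabulary and describes each video by a histogram of word occurrences. A minimal sketch of that quantization step is below; `bow_histogram` is a hypothetical helper illustrating the general technique, not code from the cited papers, and the vocabulary is assumed to be already learned (e.g., by k-means).

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize local descriptors against a visual vocabulary and return
    an L1-normalized bag-of-words histogram (illustrative sketch only)."""
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(
        descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    # Count word occurrences and normalize so histograms are comparable
    # across videos with different numbers of interest points.
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Toy example: 4 descriptors, vocabulary of 2 words in 2-D.
vocab = np.array([[0.0, 0.0], [10.0, 10.0]])
desc = np.array([[0.1, 0.2], [9.8, 10.1], [0.0, 0.1], [10.2, 9.9]])
print(bow_histogram(desc, vocab))  # two descriptors per word -> [0.5 0.5]
```

Because the histogram discards where and when each word occurred, it carries motion statistics but no structural layout, which is exactly the limitation the statement above points out.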
“…Tong et al [17] presented a nonnegative matrix factorization with a local constraint and proposed a nonnegative matrix factorization with a temporal-dependencies constraint; the method achieves an accuracy of 93.96% on the KTH dataset. Fu et al [18] proposed a method that uses a multi-scale volumetric video representation and adaptively selects the optimal space–time scale at which the saliency of a patch is most significant; the method achieves an accuracy of 94.33% on the KTH dataset. Kovashka et al [19] proposed a method that first extracts local motion and appearance features, quantizes them to a visual vocabulary, and then forms candidate neighborhoods consisting of the words associated with nearby points and their orientation with respect to the central interest point; the method achieves an accuracy of 94.53% on the KTH dataset.…”
Section: Related Work
Mentioning confidence: 99%
“…In [23] the binarized silhouette is used to compute the trace transform, which represents the global feature of the action. The sequence of silhouettes is represented as a video cube to model the action in [24]. The multi-scale volumetric approach for action videos is used in [25,26]. The action is modeled using sparse coding of image sequences in [26].…”
Section: Introduction
Mentioning confidence: 99%
“…The multi-scale volumetric approach for action videos is used in [25,26]. The action is modeled using sparse coding of image sequences in [26]. The silhouette-based analysis is also used in deep learning-based methodologies [27][28][29].…”
Section: Introduction
Mentioning confidence: 99%
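The sparse coding mentioned in the statements above represents a signal (here, features extracted from an image sequence) as a sparse combination of dictionary atoms. A common way to compute such a code is greedy orthogonal matching pursuit; the sketch below is a minimal, self-contained illustration of that general idea, not the cited paper's implementation, and the toy identity dictionary is an assumption chosen so the result is easy to verify.

```python
import numpy as np

def omp(dictionary, signal, n_nonzero):
    """Orthogonal matching pursuit: approximate `signal` as a sparse
    combination of dictionary columns (illustrative sketch only)."""
    residual = signal.copy()
    support = []
    for _ in range(n_nonzero):
        # Greedily pick the atom most correlated with the current residual.
        idx = int(np.abs(dictionary.T @ residual).argmax())
        support.append(idx)
        # Least-squares fit on the selected atoms, then update the residual.
        coef, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coef
    code = np.zeros(dictionary.shape[1])
    code[support] = coef
    return code

# Toy dictionary with unit-norm atoms; the signal equals atom 2 exactly.
D = np.eye(4)
x = np.array([0.0, 0.0, 1.0, 0.0])
print(omp(D, x, n_nonzero=1))  # recovers a code of 1.0 at index 2
```

In an action-recognition setting the dictionary would be learned from training features and the resulting sparse codes pooled over the video to form the final representation.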