2021
DOI: 10.1587/transinf.2020edl0002

Spatio-Temporal Self-Attention Weighted VLAD Neural Network for Action Recognition

Cited by 4 publications (2 citation statements)
References 13 publications

“…The HMDB51 dataset contains a total of 6,849 videos, demonstrating a total of 51 actions, each action contains at least 51 videos, and the resolution of these videos is 320 × 240 [27]. The data in HMDB51 is mainly from YouTube, Google Video, and other websites.…”
Section: Design Of Lightweight Smart Home System For Middle-aged and ...
Citation type: mentioning
confidence: 99%
“…Pose MoTion (PoTion) [10], Pose-Action 3D Machine (PA3D) [34], and Pose-Guided Inflated 3D ConvNet (PI3D) [35] achieve better performance by fusing the pose network and I3D. ST-SAWVLAD [36] proposes a spatio-temporal self-attention weighted VLAD model that follows spatio-temporal self-attention operations to improve network performance. Zheng et al. [37] obtained the state-of-the-art performance by fusing global and local detailed information of action.…”
Section: Two-stream Network
Citation type: mentioning
confidence: 99%