2021
DOI: 10.3390/s21248309

An Efficient Human Instance-Guided Framework for Video Action Recognition

Abstract: In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks using appearance and motion features, but most of these approaches focused on clip-level video action recognition. In contrast to traditional methods which generally used entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent the instance-level features using human boxes and keypoints, and our a…

Cited by 7 publications (4 citation statements)
References 42 publications
“…Therefore, ResNet-50-FPN formed our backbone network. We utilized the same ResNet-50-FPN backbone structure as described previously [43]. As opposed to Mask R-CNN, keypoint RCNN encodes a keypoint (instead of the entire mask) of the detected object.…”
Section: Model (mentioning)
confidence: 99%
“…Regarding the application of AI in guiding puncture sites during EUS-FNA/B, several limitations and shortcomings need to be considered beyond the previously discussed data standardization issue. One critical limitation is that AI is challenging for dynamic image recognition [91]. EUS images are susceptible to external elements that can cause image jitter and displacement, such as a patient's breathing and heartbeat.…”
Section: The Limitations and Shortages Of Artificial Intelligence In ... (mentioning)
confidence: 99%
“…Ref. [16] uses human boxes and key points to represent instance-level features, and the action region features of this framework are used as the input of the temporal action head network, which makes the framework more discriminative. The author of [17] proposed a multi-scale feature extraction method used to extract richer feature information.…”
Section: Action Recognition (mentioning)
confidence: 99%