2022
DOI: 10.1109/access.2022.3164730
|View full text |Cite
|
Sign up to set email alerts
|

TEDdet: Temporal Feature Exchange and Difference Network for Online Real-Time Action Detection

Abstract: Localizing and interpreting human actions in videos require understanding the spatial and temporal context of the scenes. Aside from accurate detection, vast sensing scenarios in the real-world also mandate incremental, instantaneous processing of scenes under restricted computational budgets. However, state-of-the-art detectors fail to meet the above criteria. The main challenge lies in their heavy architectural designs and detection pipelines to reason pertinent spatiotemporal information, such as incorporat… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
2

Relationship

1
1

Authors

Journals

citations
Cited by 2 publications
(1 citation statement)
references
References 34 publications
0
1
0
Order By: Relevance
“…As a subfield of video action understanding, video action recognition [2] involves only classifying the action categories in video clips, while temporal action detection [3]- [5], [8] focuses on identifying the start and end times of actions within video clips. In contrast, the difficulty of spatio-temporal action detection [6], [7], [38], [39] surpasses these tasks as it requires simultaneously determining the spatial and temporal locations of action instances in the video. Spatio-temporal action detection typically follows a two-step strategy: detection and linking.…”
Section: Introductionmentioning
confidence: 99%
“…As a subfield of video action understanding, video action recognition [2] involves only classifying the action categories in video clips, while temporal action detection [3]- [5], [8] focuses on identifying the start and end times of actions within video clips. In contrast, the difficulty of spatio-temporal action detection [6], [7], [38], [39] surpasses these tasks as it requires simultaneously determining the spatial and temporal locations of action instances in the video. Spatio-temporal action detection typically follows a two-step strategy: detection and linking.…”
Section: Introductionmentioning
confidence: 99%