Video Action Understanding

Hutchinson, Matthew; Gadepally, Vijay

doi:10.1109/access.2021.3115476

Cited by 25 publications

(19 citation statements)

References 200 publications

(224 reference statements)


“…Action recognition has been an actively studied topic [1] over the last two decades, and various models have been devised to capture the temporal information, such as X3D [18] with 3D CNN, as well as recent models [19] based on Vision Transformer [20]. However, they all require one model per domain and usually each dataset is used to train and validate models separately for performance evaluation.…”

Section: Action Recognition and Domain Adaptation (mentioning)

confidence: 99%

“…Video recognition tasks [1], especially recognition of human actions, has become important in various real-world applications, and therefore many methods have been proposed. In order to train deep models, it is necessary to collect a variety of videos of human actions in various situations, therefore many datasets have been proposed [2]- [4].…”

Section: Introduction (mentioning)

confidence: 99%


Model-Agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition

Omi

Kimata

Tamaki

2022

IEICE Trans. Inf. & Syst.


In this paper, we propose a multi-domain learning model for action recognition. The proposed method inserts domain-specific adapters between the domain-independent layers of a backbone network. Unlike a multi-head network, which switches only the classification heads, our model also switches the adapters, which facilitates learning feature representations universal to multiple domains. Unlike prior works, the proposed method is model-agnostic and does not assume a particular model structure. Experimental results on three popular action recognition datasets (HMDB51, UCF101, and Kinetics-400) demonstrate that the proposed method is more effective than a multi-head architecture and more efficient than separately training models for each domain.
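The adapter-switching idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `MultiDomainNet`, the helpers `make_linear` and `make_scale`, the domain names `"a"` and `"b"`, and the toy weights are all hypothetical, and the placement of one adapter after each shared layer is an assumption made for illustration. The sketch only shows shared layers being reused across domains while the adapters and the classification head are selected by domain.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_linear(weights: List[Vector]) -> Layer:
    """Return a layer that multiplies its input by a fixed weight matrix."""
    def apply(x: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, x)) for row in weights]
    return apply


def make_scale(factor: float) -> Layer:
    """Return a toy 'adapter' that rescales every feature (illustrative only)."""
    def apply(x: Vector) -> Vector:
        return [factor * v for v in x]
    return apply


class MultiDomainNet:
    """Shared backbone layers with per-domain adapters and per-domain heads."""

    def __init__(self,
                 shared_layers: List[Layer],
                 adapters: Dict[str, List[Layer]],
                 heads: Dict[str, Layer]) -> None:
        self.shared_layers = shared_layers  # reused by every domain
        self.adapters = adapters            # one adapter per shared layer, per domain
        self.heads = heads                  # one classification head per domain

    def forward(self, x: Vector, domain: str) -> Vector:
        # Assumption: each shared layer is followed by that domain's adapter.
        for layer, adapter in zip(self.shared_layers, self.adapters[domain]):
            x = adapter(layer(x))
        # The head is also switched, so output sizes may differ per domain.
        return self.heads[domain](x)


net = MultiDomainNet(
    shared_layers=[
        make_linear([[1.0, 0.0], [0.0, 1.0]]),  # identity
        make_linear([[0.0, 1.0], [1.0, 0.0]]),  # swaps the two features
    ],
    adapters={
        "a": [make_scale(2.0), make_scale(1.0)],
        "b": [make_scale(1.0), make_scale(-1.0)],
    },
    heads={
        "a": make_linear([[1.0, 1.0]]),                          # 1 output
        "b": make_linear([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),  # 3 outputs
    },
)

out_a = net.forward([1.0, 2.0], "a")
out_b = net.forward([1.0, 2.0], "b")
print(out_a, out_b)
```

The same input passes through the same shared layers in both calls; only the adapters and head differ, which is why the two domains can produce outputs of different sizes. A multi-head baseline in this toy setting would correspond to using identical adapters for every domain.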


“…In recent years, human action detection has become an active area of research driven by numerous vision applications and industries such as autonomous driving, security monitoring, transport and human-computer interaction systems, etc. The task (also referred to as action localization) concerns creating spatiotemporal action proposals to locate individual actors in space and time from a video, as well as classifying their undergoing action categories [1]. Inherently, action detection imposes more challenges in general when compared to action recognition which seeks only the global label of the video.…”

Section: Introduction (mentioning)

confidence: 99%

TEDdet: Temporal Feature Exchange and Difference Network for Online Real-Time Action Detection

Liu

Yang

Ginhac

2022

IEEE Access


Localizing and interpreting human actions in videos requires understanding the spatial and temporal context of the scenes. Aside from accurate detection, many real-world sensing scenarios also mandate incremental, instantaneous processing of scenes under restricted computational budgets. However, state-of-the-art detectors fail to meet these criteria. The main challenge lies in their heavy architectural designs and detection pipelines for reasoning about pertinent spatiotemporal information, such as incorporating 3D Convolutional Neural Networks (CNNs) or extracting optical flow. With this insight, we propose a lightweight action tubelet detector, coined TEDdet, which unifies complementary feature aggregation and motion modeling modules. Specifically, our Temporal Feature Exchange module induces feature interaction by adaptively aggregating 2D CNN features over successive frames. To address actors' location shift in the sequence, our Temporal Feature Difference module accumulates approximated pair-wise motion among target frames as trajectory cues. These modules can be easily integrated with an existing anchor-free detector to cooperatively model action instances' categories, sizes, and movement for precise tubelet generation. TEDdet exploits larger temporal strides to efficiently infer actions in a coarse-to-fine and online manner. Without relying on 3D CNNs or optical flow models, our detector demonstrates competitive accuracy at an unprecedented speed (89 FPS) that is more compliant with realistic applications. Code will be available at https://github.com/alphadadajuju/TEDdet.


“…It is precisely the second group of algorithms that is the focus of this work. The main tasks of video understanding include categories such as action recognition, action prediction, and action localization [1].…”

Section: Introduction (unclassified)

“…The next video understanding task is action prediction, within which several subtasks are also distinguished: action anticipation and early action prediction [1]. In the action anticipation task, classification is performed on the basis of contextual cues, before the action has begun.…”

Section: Introduction (unclassified)

Identification of Behavioral Patterns (Идентификация Поведенческих Паттернов)

Фоминский

2022

Proceedings of the IX International Scientific Conference «Mathematical and Software Support of Information, Technical and Econom…
