Hierarchical Relational Networks for Group Activity Recognition and Retrieval

Ibrahim, Mostafa S.; Mori, Greg

doi:10.1007/978-3-030-01219-9_44

Cited by 108 publications

(51 citation statements)

References 22 publications


“…More recently, deep neural network architecture achieves substantial success on group activity understanding due to its high-capacity of multi-level feature representation and integration [1,2,3,4,13,14]. Wang et al [2] proposed a recurrent interactional context model to aggregate person level, group level and scene level interactions.…”

Section: Related Work (mentioning)

confidence: 99%

“…Deng et al [13] performed structure learning by a unified framework of integrating graphical models and a sequential inference network. Sport Video Analysis: Recently, a considerable amount of efforts have been devoted to team sports analysis, such as basketball [1], volleyball [3,4,5,14,15,16,50,51,52], soccer [19,20], water polo [18], ice hockey [21] etc. Ramanathan et al [1] introduced an attention based BLSTM network to identify the most relevant component (key player) of the corresponding event and recognize basketball events.…”

Section: Related Work (mentioning)

confidence: 99%

“…Ramanathan et al [1] introduced an attention based BLSTM network to identify the most relevant component (key player) of the corresponding event and recognize basketball events. Ibrahim et al [3] built a relationship graph guided network to infer relational representations of each player and encode scene level representations. Wu et al [46] proposed a domain knowledge based basketball semantic events classification method which represented global and collective motion patterns on different event stages for event recognition.…”

Section: Related Work (mentioning)

confidence: 99%

“…The main challenge is how to extract discriminative and robust spatio-temporal contextual features in the dynamic scenes. A great deal of attempts utilized various modalities of data to establish the mapping relationship between the group activity and semantic representation such as key components [1], multi-level interaction representations [2], hierarchical relational representations [3] and semantic graph [4].…”

Section: Introduction (mentioning)

confidence: 99%


Fusing motion patterns and key visual information for semantic event recognition in basketball videos. Yang, Wang, et al. 2020. Neurocomputing.


“…In recent years, collective activity recognition has made great progress with the development of deep learning [2], [9], [19], [20], [25], [29], [32], [38]. Typically, they first extract person-level features using a convolutional neural network (CNN).…”

Section: Introduction (mentioning)

confidence: 99%

Group Activity Recognition by Using Effective Multiple Modality Relation Representation With Temporal-Spatial Attention. et al. 2020. IEEE Access.


Group activity recognition has received a great deal of interest because of its broad applications in sports analysis, autonomous vehicles, CCTV surveillance systems and video summarization systems. Most existing methods rely on appearance features and seldom consider the underlying interaction information. In this work, a novel group activity recognition method is proposed based on multi-modal relation representation with temporal-spatial attention. First, we introduce an object relation module, which processes all objects in a scene simultaneously through an interaction between their appearance features and geometry, thus allowing their relations to be modeled. Second, to extract effective motion features, an optical flow network is fine-tuned using the action loss as the supervisory signal. Then, we propose two types of inference models, opt-GRU and relation-GRU, which encode the object relationships and motion representation and form a discriminative frame-level feature representation. Finally, an attention-based temporal aggregation layer is proposed to integrate frame-level features with different weights and form an effective video-level representation. Extensive experiments on two popular datasets, the Volleyball dataset and the Collective Activity dataset, achieve state-of-the-art performance on both.
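
The attention-based temporal aggregation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `attention_pool`, the linear scoring vector `score_weights`, and the toy feature values are all assumptions made for illustration. The sketch scores each frame-level feature, normalizes the scores with a softmax, and forms the video-level feature as the weighted sum of the frames.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    frame_features: Sequence[Sequence[float]],
    score_weights: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Aggregate frame-level features into one video-level feature.

    score_weights is a hypothetical learned vector (assumption): the
    attention score of a frame is its dot product with this vector.
    Returns (video_feature, attention_weights).
    """
    # One scalar score per frame.
    scores = [
        sum(w * x for w, x in zip(score_weights, frame))
        for frame in frame_features
    ]
    # Normalize scores so the attention weights sum to 1 over time.
    alphas = softmax(scores)
    dim = len(frame_features[0])
    # Weighted sum of frame features, computed per feature dimension.
    video = [
        sum(a * frame[d] for a, frame in zip(alphas, frame_features))
        for d in range(dim)
    ]
    return video, alphas


if __name__ == "__main__":
    frames = [[1.0, 0.0], [3.0, 2.0]]  # toy frame-level features
    video, alphas = attention_pool(frames, score_weights=[0.5, 0.5])
    print(video, alphas)
```

With an all-zero scoring vector every frame receives the same weight, so the sketch reduces to plain average pooling over time; a non-zero scoring vector shifts weight toward the frames that score higher.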


Mask Guided Fusion for Group Activity Recognition in Images. Akar, Ikizler-Cinbis. 2019. Lecture Notes in Computer Science.

