Visual recognition by counting instances: A multi-instance cardinality potential kernel

Hajimirsadeghi, Hossein; Wang, Yan; Vahdat, Arash; Mori, Greg

doi:10.1109/cvpr.2015.7298875

Cited by 75 publications

(52 citation statements)

References 25 publications

“…Backbone, Group activity: SIM [12] (AlexNet, 81.2%); HDTM [24] (AlexNet, 81.5%); Cardinality Kernel [17] (None, 83.4%); SBGAR [32] (Inception-v3, 86.1%); CERN [45] (VGG16, 87.2%); stagNet (GT) [39] (VGG16, 89.1%); stagNet (PRO) [ [3], and outperforms it by about 2% on group activity recognition accuracy, since our model can capture and exploit the relation information among actors. And, we also achieve better performance on individual action recognition task.…”

Section: Methods (mentioning)

confidence: 99%

“…Group activity recognition has been extensively studied from the research community. The earlier approaches are mostly based on a combination of hand-crafted visual features with probability graphical models [1,31,30,43,6,8,17] or AND-OR grammar models [2,46]. Recently, the wide adoption of deep convolutional neural networks (CNNs) has demonstrated significant performance improvements on group activity recognition [3,24,41,45,12,32,59,23,39].…”

Section: Related Work (mentioning)

confidence: 99%

Learning Actor Relation Graphs for Group Activity Recognition

Wang et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Modeling relation between actors is important for recognizing group activity in a multi-person scene. This paper aims at learning discriminative relation between actors efficiently using deep models. To this end, we propose to build a flexible and efficient Actor Relation Graph (ARG) to simultaneously capture the appearance and position relation between actors. Thanks to the Graph Convolutional Network, the connections in ARG could be automatically learned from group activity videos in an end-to-end manner, and the inference on ARG could be efficiently performed with standard matrix operations. Furthermore, in practice, we come up with two variants to sparsify ARG for more effective modeling in videos: spatially localized ARG and temporal randomized ARG. We perform extensive experiments on two standard group activity recognition datasets: the Volleyball dataset and the Collective Activity dataset, where state-of-the-art performance is achieved on both datasets. We also visualize the learned actor graphs and relation features, which demonstrate that the proposed ARG is able to capture the discriminative relation information for group activity recognition.

“…Accuracy: [4] 80.40; [10] 83.40; [15] 79.70; HDTM [13] 81.50; SBGAR [16] 86.10; CERN [21] 87.20; CRM-RGB 83.41; CRM-Flow 85.44; CRM 85.75 labels. Future work can adapt this model to extract the spatial relations in person-object scenarios.…”

Section: Methods (mentioning)

confidence: 99%

Convolutional Relational Machine for Group Activity Recognition

Azar, Atigh, Nickabadi, et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

We present an end-to-end deep Convolutional Neural Network called Convolutional Relational Machine (CRM) for recognizing group activities that utilizes the information in spatial relations between individual persons in image or video. It learns to produce an intermediate spatial representation (activity map) based on individual and group activities. A multi-stage refinement component is responsible for decreasing the incorrect predictions in the activity map. Finally, an aggregation component uses the refined information to recognize group activities. Experimental results demonstrate the constructive contribution of the information extracted and represented in the form of the activity map. CRM shows advantages over state-of-the-art models on Volleyball and Collective Activity datasets.

“…Any person interaction description can then be simply integrated in the model, e.g. [19,35,12] and not only in feature encoding. This led to the introduction of pairwise potentials within energy-based formulations, so that persons involved in the same activity are jointly considered.…”

Section: Related Work (mentioning)

confidence: 99%

Recognition of Group Activities in Videos Based on Single- and Two-Person Descriptors

Lathuiliere, Evangelidis, Horaud 2017

2017 IEEE Winter Conference on Applications of Computer Vision (WACV)

Group activity recognition from videos is a very challenging problem that has barely been addressed. We propose an activity recognition method using group context. In order to encode both single-person description and two-person interactions, we learn mappings from high-dimensional feature spaces to low-dimensional dictionaries. In particular the proposed two-person descriptor takes into account geometric characteristics of the relative pose and motion between the two persons. Both single-person and two-person representations are then used to define unary and pairwise potentials of an energy function, whose optimization leads to the structured labeling of persons involved in the same activity. An interesting feature of the proposed method is that, unlike the vast majority of existing methods, it is able to recognize multiple distinct group activities occurring simultaneously in a video. The proposed method is evaluated with datasets widely used for group activity recognition, and is compared with several baseline methods.
