2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015
DOI: 10.1109/cvpr.2015.7298875

Visual recognition by counting instances: A multi-instance cardinality potential kernel

Abstract: Many visual recognition problems can be approached by counting instances. To determine whether an event is present in a long internet video, one could count how many frames seem to contain the activity. Classifying the activity of a group of people can be done by counting the actions of individual people. Encoding these cardinality relationships can reduce sensitivity to clutter, in the form of irrelevant frames or individuals not involved in a group activity. Learned parameters can encode how many instances t…
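
The counting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's cardinality potential kernel: it only shows the simpler notion of labelling a video (a "bag") by counting how many frames (the "instances") look positive. The per-frame scores, the 0.5 score threshold, and the `min_fraction` parameter are all hypothetical stand-ins chosen for illustration; the abstract refers to learned parameters, whereas the values here are fixed by hand.

```python
from typing import Sequence


def count_positive_instances(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Count instances (e.g. frames) whose score is above the threshold."""
    return sum(1 for s in scores if s > threshold)


def bag_is_positive(
    scores: Sequence[float],
    min_fraction: float = 0.25,  # hypothetical, hand-set; not a learned value
    threshold: float = 0.5,      # hypothetical per-instance cut-off
) -> bool:
    """Label a bag positive when enough of its instances look positive.

    Requiring a minimum fraction, rather than a single positive frame,
    makes the decision less sensitive to a few spurious detections.
    """
    if not scores:
        return False
    count = count_positive_instances(scores, threshold)
    return count / len(scores) >= min_fraction


# Hypothetical per-frame scores for two videos.
event_video = [0.9, 0.8, 0.1, 0.2, 0.7, 0.05, 0.1, 0.3, 0.2, 0.1]
clutter_video = [0.9, 0.1, 0.1, 0.2, 0.1, 0.05, 0.1, 0.3, 0.2, 0.1]

print(bag_is_positive(event_video))    # 3 of 10 frames exceed 0.5
print(bag_is_positive(clutter_video))  # 1 of 10 frames exceeds 0.5
```

A fixed fraction is the crudest possible cardinality rule; it is used here only to make the bag-versus-instance distinction concrete.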

Cited by 75 publications (52 citation statements)
References 25 publications
“…Backbone Group activity SIM [12] AlexNet 81.2% HDTM [24] AlexNet 81.5% Cardinality Kernel [17] None 83.4% SBGAR [32] Inception-v3 86.1% CERN [45] VGG16 87.2% stagNet (GT) [39] VGG16 89.1% stagNet (PRO) [ [3], and outperforms it by about 2% on group activity recognition accuracy, since our model can capture and exploit the relation information among actors. And, we also achieve better performance on individual action recognition task.…”
Section: Methods (mentioning)
confidence: 99%
“…Group activity recognition has been extensively studied from the research community. The earlier approaches are mostly based on a combination of hand-crafted visual features with probability graphical models [1,31,30,43,6,8,17] or AND-OR grammar models [2,46]. Recently, the wide adoption of deep convolutional neural networks (CNNs) has demonstrated significant performance improvements on group activity recognition [3,24,41,45,12,32,59,23,39].…”
Section: Related Work (mentioning)
confidence: 99%
“…Accuracy [4] 80.40 [10] 83.40 [15] 79.70 HDTM [13] 81.50 SBGAR [16] 86.10 CERN [21] 87.20 CRM-RGB 83.41 CRM-Flow 85.44 CRM 85.75 labels. Future work can adapt this model to extract the spatial relations in person-object scenarios.…”
Section: Methods (mentioning)
confidence: 99%
“…Any person interaction description can then be simply integrated in the model, e.g. [19,35,12] and not only in feature encoding. This led to the introduction of pairwise potentials within energy-based formulations, so that persons involved in the same activity are jointly considered.…”
Section: Related Work (mentioning)
confidence: 99%