2017
DOI: 10.1109/tpami.2016.2608901

Semantic Pooling for Complex Event Analysis in Untrimmed Videos

Abstract: Pooling plays an important role in generating a discriminative video representation. In this paper, we propose a new semantic pooling approach for challenging event analysis tasks (e.g. event detection, recognition, and recounting) in long untrimmed Internet videos, especially when only a few shots/segments are relevant to the event of interest while many other shots are irrelevant or even misleading. The commonly adopted pooling strategies aggregate the shots indifferently in one way or another, resulting in …

Citations: cited by 313 publications (77 citation statements)
References: 60 publications (63 reference statements)

“…Most of the traffic sign detection methods are focused on images. Another focus of study can be to analyze traffic videos for traffic sign detection by leveraging the semantic representations [20,21]. Yet another focus could consider mining the correlations between the features of traffic signs by a semi-supervised feature selection framework [22] in traffic videos.…”
Section: Related Work (mentioning)
confidence: 99%
“…For general poolings, three popular approaches were surveyed, i.e., sum pooling [66][67][68][69][70], average pooling [71][72][73][74][75][76], and max pooling [77][78][79][80]. For particular poolings, another three popular approaches were surveyed, i.e., stochastic pooling [81], semantic pooling [82], and multi-scale pooling [83][84][85][86].…”
Section: Feature Encoding and Pooling Taxonomy (mentioning)
confidence: 99%
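
The three general poolings named in the statement above can be illustrated with a minimal sketch. It assumes each video shot is already represented by a fixed-length feature vector; the function names and the toy data are hypothetical and are not taken from the cited survey.

```python
from typing import List

Vector = List[float]


def sum_pool(shots: List[Vector]) -> Vector:
    """Add the shot features dimension by dimension."""
    return [sum(column) for column in zip(*shots)]


def average_pool(shots: List[Vector]) -> Vector:
    """Average the shot features, so every shot counts equally."""
    count = len(shots)
    return [total / count for total in sum_pool(shots)]


def max_pool(shots: List[Vector]) -> Vector:
    """Keep the largest response in each dimension across shots."""
    return [max(column) for column in zip(*shots)]


# Toy example: three shots, two feature dimensions (made-up values).
shots = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
print(sum_pool(shots))
print(average_pool(shots))
print(max_pool(shots))
```

None of the three looks at what a shot contains, which is the limitation the semantic pooling statements in this report point to.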
“…For complex event detection in long internet videos with few relevant shots, traditional pooling strategies treat usually each shot equally and cannot aggregate the shots based on their relevance with respect to the event of interest [82]. Chang et al [82] proposed a semantic pooling approach to prioritize CNN shot outputs according to their semantic saliencies.…”
Section: Semantic Pooling (mentioning)
confidence: 99%
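
To make the idea of prioritizing shots by relevance concrete, here is a rough sketch of a saliency-weighted average. This is a simplified stand-in chosen for illustration, not the method of Chang et al.; the function name, the saliency scores, and the feature values are all hypothetical.

```python
from typing import List

Vector = List[float]


def saliency_weighted_pool(shots: List[Vector], saliencies: List[float]) -> Vector:
    """Average shot features with weights proportional to per-shot saliency.

    Assumption: a non-negative saliency score per shot is already available
    (for example from some concept detector); how it is computed is out of scope.
    """
    if len(shots) != len(saliencies):
        raise ValueError("need exactly one saliency score per shot")
    total = sum(saliencies)
    if total <= 0:
        raise ValueError("saliency scores must sum to a positive value")
    weights = [score / total for score in saliencies]
    return [
        sum(weight * value for weight, value in zip(weights, column))
        for column in zip(*shots)
    ]


# One relevant shot followed by two irrelevant ones (made-up values).
shots = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
saliencies = [6.0, 1.0, 1.0]
print(saliency_weighted_pool(shots, saliencies))
```

With equal saliency scores this reduces to plain average pooling, so the scores are the only place where relevance to the event enters.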
“…As a result, cross-modal retrieval attracts increasing attention and plays an important role to describe the content of an image with natural language and conversely retrieve image given textual query Pereira and Vasconcelos (2014); Amir et al (2004); Chang et al (2017a). However, since data in diverse modalities are presented in heterogeneous feature spaces and usually have varying statistical properties, it is a significant challenge to bridge the heterogeneity-gap between multi-modal data Grangier and Bengio (2008); Ranjan et al (2015).…”
Section: Introduction (mentioning)
confidence: 99%