2012 IEEE Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2012.6247808

Learning latent temporal structure for complex event detection

Abstract: In this paper, we tackle the problem of understanding the temporal structure of complex events in highly varying videos obtained from the Internet. Towards this goal, we utilize a conditional model trained in a max-margin framework that is able to automatically discover discriminative and interesting segments of video, while simultaneously achieving competitive accuracies on difficult detection and recognition tasks. We introduce latent variables over the frames of a video, and allow our algorithm to discover …

Citations: cited by 309 publications (281 citation statements)
References: 28 publications
“…Most prior works in robotics and computer vision formulate the activity recognition problem as Conditional Random Fields (CRFs) [4,5,6,9,10,14,15,20,21]. The CRFs model the environment with nodes and edges.…”
Section: Related Work (mentioning)
confidence: 99%
“…1). This topic has been widely studied in both robotics communities [3,5,10,14] and other fields [15,19,20]. Most of the work uses a dataset where labels are hard assigned regardless of uncertainty.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, EM requires an arbitrary initialization at its turn. In [12], the initial states are first assigned with a unique label, and then the number of labels is reduced by agglomerative clustering. In this work, we propose initialization strategies inspired by the assumed semantics for the states:…”
Section: Latent State Initialization (mentioning)
confidence: 99%
“…Recently, works such as [12], [13] have demonstrated the importance of temporal structure in complex event recognition. In this paper, we propose to combine the use of trained concept detectors with a latent temporal model.…”
Section: Introduction and Related Work (mentioning)
confidence: 99%
“…(e.g. HMMs [3], Dynamic Bayesian Networks [24], prototype trees [11], AND-OR graphs [23], latent SVM [22], Sum-Product Network [4], and Markov Logic Networks [16]). The key advantage of graphical structures is that they model the dependence of actions by local relationships while allowing for the joint optimization of a global task-dependent objective function.…”
Section: Introduction (mentioning)
confidence: 99%