2018
DOI: 10.1109/tmm.2017.2741423
F-DES: Fast and Deep Event Summarization

Cited by 128 publications (30 citation statements)
References 39 publications
“…Jain et al. propose a method that embeds structure into a deep model [31] to incorporate knowledge with deep models for activity recognition. Other deep approaches proposed for activity recognition are [32], [33]. A body of research has used objects, their affordances, and states in video for action recognition [34], [35], [36], [37], [38], [39].…”
Section: B. Video Understanding
Citation type: mentioning, confidence: 99%
“…Various researchers have proposed methods for handling multi-camera scenarios. Event summarization in multi-view videos using a deep learning approach [40], detection and summarization of an event in multi-view surveillance videos by applying boosting [41], and a machine learning ensemble method [42] are instances of research in the area of multi-view video understanding. We have not addressed this aspect of video understanding in this work.…”
Section: B. Video Understanding
Citation type: mentioning, confidence: 99%
“…Krishan Kumar et al. [1] proposed the FASTA approach, a local-alignment-based method for summarizing events in multi-view videos. A Convolutional Neural Network (CNN) is trained on RGB input images with multiple multi-channel filters. Initially, N frames of equal length from a single view are fed into the CNN to extract visual features and perform object detection.…”
Section: J. FASTA
Citation type: mentioning, confidence: 99%
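
The statement above describes the per-view feature-extraction step: N frames from a single camera view are passed through a CNN to obtain visual features. The following is a minimal sketch of that step, assuming a pretrained torchvision ResNet-50 backbone with standard ImageNet preprocessing; the backbone choice and the helper name extract_view_features are illustrative assumptions, not the paper's actual implementation.

import torch
import torchvision.models as models
import torchvision.transforms as T

# Standard ImageNet preprocessing (assumed; the original pipeline is not specified here).
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Drop the classification head so the network emits one feature vector per frame.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_view_features(frames):
    # frames: list of PIL RGB frames taken from a single camera view.
    batch = torch.stack([preprocess(f) for f in frames])  # (N, 3, 224, 224)
    return backbone(batch)                                # (N, 2048) per-frame features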
“…[8] outperforms [11]. Rameswar Panda et al. [3] proposed an unsupervised framework via joint embedding and sparse representative selection to resolve inter-view dependencies among multi-view videos. They extract CNN visual features from the video frames; it outperforms [11] and [8]. In the method of Krishan Kumar et al. [1], CNN features are extracted from the input video frames and a nucleotide sequence is created based on the cosine similarity of CNN features between frames. The FASTA algorithm is used to remove inter-view redundancy, and the correlations among multiple views are captured using an optimized alignment approach.…”
Section: Inference From the Survey
Citation type: mentioning, confidence: 99%
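
The statement above explains how [1] turns frame-level CNN features into an alignable sequence: cosine similarity between features is quantized into a nucleotide-like alphabet so that FASTA-style local alignment can expose inter-view redundancy. The sketch below illustrates only the sequence-construction step under assumed thresholds and a four-symbol alphabet; features_to_sequence and its bin boundaries are hypothetical, and the actual quantization and alignment in [1] may differ.

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two CNN feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def features_to_sequence(features, bins=(0.25, 0.5, 0.75), alphabet="ACGT"):
    # features: (N, D) array of per-frame CNN features for one view.
    # Each consecutive-frame similarity is mapped to one symbol, giving a
    # length N-1 string that a local-alignment tool can compare across views.
    symbols = []
    for prev, curr in zip(features[:-1], features[1:]):
        s = cosine_similarity(prev, curr)
        symbols.append(alphabet[int(np.digitize(s, bins))])
    return "".join(symbols)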