2018
DOI: 10.1016/j.ins.2017.12.020

A salient dictionary learning framework for activity video summarization via key-frame extraction

Citations: cited by 49 publications (18 citation statements)
References: 33 publications
“…$L_{dict}$ exploits $h$ and a common matrix $A$. It is inspired by the dictionary-of-representatives formulation of unsupervised video key-frame extraction [3]. In this generic framework, given original input video $X \in \mathbb{R}^{M \times T}$, the goal is to find an optimal summary matrix $S \in \mathbb{R}^{M \times C}$, $C \ll T$, and a reconstruction coefficient matrix $B \in \mathbb{R}^{C \times T}$, so that the columns of $S$ constitute a subset of columns of $X$ and the following objective is minimized:…”
Section: Proposed Dictionary Loss (citation type: mentioning)
confidence: 99%
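
The objective itself is cut off in the statement above, so the following is only a rough sketch of the general idea of choosing columns of X as a summary S with coefficients B. It assumes a plain Frobenius reconstruction error ||X - SB||_F^2 and a simple greedy search; both are illustrative choices, not the objective or solver of the cited paper, and the function name `greedy_key_frames` is hypothetical.

```python
import numpy as np


def greedy_key_frames(X, C):
    """Greedily pick C column indices of X (shape M x T).

    Assumption: the cost is the squared Frobenius reconstruction error
    ||X - S B||_F^2, with B obtained by least squares for each candidate S.
    This is an illustrative stand-in, not the cited paper's objective.
    """
    T = X.shape[1]
    selected = []
    for _ in range(C):
        best_idx, best_err = None, np.inf
        for t in range(T):
            if t in selected:
                continue
            # Candidate summary: already chosen frames plus frame t.
            S = X[:, selected + [t]]
            B, *_ = np.linalg.lstsq(S, X, rcond=None)
            err = np.linalg.norm(X - S @ B, "fro") ** 2
            if err < best_err:
                best_idx, best_err = t, err
        selected.append(best_idx)
    # Final summary matrix and reconstruction coefficients.
    S = X[:, selected]
    B, *_ = np.linalg.lstsq(S, X, rcond=None)
    return selected, S, B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "video": 6 frames (columns) of 5 features, spanning a 2-D subspace.
    X = rng.normal(size=(5, 2)) @ rng.normal(size=(2, 6))
    idx, S, B = greedy_key_frames(X, C=2)
    print("selected frame indices:", idx)
    print("S shape:", S.shape, "B shape:", B.shape)
```

Because S is built by indexing columns of X, the constraint that the summary is a subset of the original frames holds by construction; the greedy loop costs roughly C·T least-squares solves, so it is only practical for small inputs.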
“…Typical key-frame selection criteria were summary diversity and reconstructive ability, the latter being a way of formalizing representativeness as the degree to which the key-frames are jointly able to visually reconstruct all original video frames. Additionally, in various unsupervised approaches, key-frame difference from its temporal neighbours or similar saliency criteria were also employed [3,4,5,6]. In certain cases these algorithms processed raw video frames, but typically they were fed manually crafted image/video features [7,8].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Deep Learning Methods: To overcome the limitations of conventional methods, more recent works focus on designing deep learning models to tackle the problem of key frame detection. Several supervised and unsupervised models have been proposed for key frame detection in videos which significantly boost the performance of various downstream tasks [1,4,14,17,49–56]. Yang et al. [50] first introduced the bidirectional long short term memory (Bi-LSTM) for automatically extracting the highlights (key frames) from videos.…”
Section: Key Frame Detection (citation type: mentioning)
confidence: 99%
“…The visual summary of videos has also been widely studied in particular to provide a smart scroll bar when streaming videos. The two most common frameworks are key frame selection [6], [7], [8], [9] and key sub-shot selection [10], [11]. Both frameworks are classically unsupervised and aim at finding clusters that best describe all frames.…”
Section: Introduction (citation type: mentioning)
confidence: 99%