2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015
DOI: 10.1109/cvpr.2015.7298836

Gaze-enabled egocentric video summarization via constrained submodular maximization

Abstract: With the proliferation of wearable cameras, the number of videos of users documenting their personal lives using such devices is rapidly increasing. Since such videos may span hours, there is an important need for mechanisms that represent the information content in a compact form (i.e., shorter videos which are more easily browsable/sharable). Motivated by these applications, this paper focuses on the problem of egocentric video summarization. Such videos are usually continuous with significant camera shake a…

Citations: cited by 150 publications (108 citation statements)
References: 35 publications
“…Analysis of images captured by egocentric cameras can reveal a lot about the person recording such images, including intentions, personality, interests, etc. First-person gaze prediction is useful in a wide range of applications in health care, education and entertainment, for tasks such as action and event recognition [35], recognition of handled objects [37], discovering important people [16], video re-editing [26], video summarization [45], engagement detection [42], and assistive vision systems [18].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the supervised video summarization models, a key factor they are supposed to encompass is the diversity of the selected subset of video shots. This is often imposed by submodularity [10,16] and determinant [1,11,17]. When a video sequence is short, global diversity over the whole sequence seems like a natural choice [11,10].…”
Section: Introduction (mentioning)
confidence: 99%
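
To make the statement above concrete, here is a minimal sketch of how a submodular objective can impose diversity on a selected subset of shots. It is an illustration only, not the method of the cited paper or of the citing work: the facility-location objective, the plain cardinality budget, the toy feature matrix, and the names `facility_location` and `greedy_select` are all assumptions made for this example.

```python
import numpy as np


def facility_location(sim, selected):
    """Coverage score: every shot is credited with its best similarity
    to any shot in `selected` (hypothetical helper for this sketch)."""
    if not selected:
        return 0.0
    return float(sim[:, selected].max(axis=1).sum())


def greedy_select(sim, budget):
    """Greedily add the shot with the largest marginal gain until
    `budget` shots are chosen or no shot improves the score."""
    selected = []
    for _ in range(budget):
        current = facility_location(sim, selected)
        best_gain, best_idx = 0.0, None
        for j in range(sim.shape[0]):
            if j in selected:
                continue
            gain = facility_location(sim, selected + [j]) - current
            if gain > best_gain:
                best_gain, best_idx = gain, j
        if best_idx is None:
            break
        selected.append(best_idx)
    return selected


# Toy data (assumed): shots 0 and 1 look alike, shots 2 and 3 look alike.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
sim = features @ features.T  # non-negative similarities
summary = greedy_select(sim, budget=2)
```

Because a second shot from an already covered group adds little to the coverage score, the greedy rule tends to pick one shot from each group, which is the diversity effect the statement describes. With non-negative similarities this objective is monotone submodular, so greedy selection under a cardinality budget carries the classical (1 - 1/e) approximation guarantee.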
“…Such coherency can be achieved by selecting sets of consecutive frames [5] or by adding temporal regularization [11]. The balance between interestingness and coherency can be obtained using pre-segmentation methods [5], submodular optimization [6,19] or recurrent neural networks [22]. Here, we propose a model that jointly incorporates interestingness and coherency, and which is capable of adjusting the summaries based on a user-provided music track.…”
Section: Related Work (mentioning)
confidence: 99%