“…With the rapid growth of digital video (e.g., an estimated 20 hours of video are uploaded to YouTube every minute), video summarization technology [39] has become much more important, especially as content-based indexing and retrieval of video sequences has seen only limited success. Many prominent works have been proposed, such as mosaic-based video summarization [40], keyframe extraction via scene categorization [41]-[44], egocentric video summarization [45], story-driven video summarization [46], large-scale video summarization via web image priors [47], joint video and image summarization [48], category-specific video summarization [49], dictionary-learning-based video summarization [50], consumer video summarization [51], group-sparsity video summarization [18], [52], and ℓ2,0-norm-based dictionary selection for video summarization using SOMP [53]. Generally speaking, there are two sub-problems in video summarization: 1) keyframe extraction, i.e., extracting the most representative images from the underlying video sequence; and 2) video skim generation, i.e., extracting a collection of video segments from the original video sequence, where each video skim is itself a video clip of significantly shorter duration.…”