Key frames extraction using graph modularity clustering for efficient video summarization

Gharbi, H A; Bahroun, Sahbi; Zagrouba, Ezzeddine

doi:10.1109/icassp.2017.7952407

Cited by 23 publications

(12 citation statements)

References 13 publications

“…In this test, we compare our method with some state-of-the-art methods using the foreman sequence. The first is a local-feature-based keyframe extraction method [10] and the other is face-feature based [18]. A brief description of those two methods is given below.…”

Section: Results (mentioning)

confidence: 99%

KS‐FQA: Keyframe selection based on face quality assessment for efficient face recognition in video

Bahroun

Abed

Zagrouba

2020

IET Image Processing

Self Cite

Video is one of the most useful and important forms of multimedia data and is used in many applications. Despite its importance, video indexing and retrieval remains a challenging task. To reduce the amount of data and keep only relevant frames, keyframe extraction is necessary in a content‐based video retrieval (CBVR) system. In this paper, a keyframe extraction method based on face image quality is proposed for video surveillance systems. Data is reduced by rejecting frames without faces. Then, face images are clustered by identity. After that, a set of candidate frames is selected for processing. The face quality assessment is based on four metrics (pose estimation, sharpness, brightness and resolution), and the frame with the best face quality is taken as the keyframe. Experimental tests were carried out on several datasets to demonstrate the efficiency of the authors' method compared with state‐of‐the‐art approaches.
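The final selection step described in this abstract (score each candidate face on four metrics, keep the best frame per identity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the four metrics are combined, so the equal-weight average, the assumption that every metric is already normalized to [0, 1], and the names `FaceCandidate`, `quality` and `select_keyframes` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class FaceCandidate:
    """One candidate face; all four scores are assumed normalized to [0, 1]."""
    frame_index: int
    identity: str      # cluster label from the identity-clustering step
    pose: float        # higher means closer to frontal (assumption)
    sharpness: float
    brightness: float
    resolution: float


def quality(c: FaceCandidate) -> float:
    # Assumption: equal weights. The abstract names the metrics, not the fusion rule.
    return (c.pose + c.sharpness + c.brightness + c.resolution) / 4.0


def select_keyframes(candidates: Iterable[FaceCandidate]) -> Dict[str, int]:
    """Return, for each identity, the frame index with the highest quality score."""
    best: Dict[str, FaceCandidate] = {}
    for c in candidates:
        current = best.get(c.identity)
        if current is None or quality(c) > quality(current):
            best[c.identity] = c
    return {identity: c.frame_index for identity, c in best.items()}
```

Ties keep the earliest candidate seen, which is an arbitrary choice made for this sketch.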

“…After that, they detect the interest points in all frames and calculate the repeatability matrix for each shot to extract the most representative frames. In [10], the authors used a windowing rule which consists of selecting one frame for each FPS, in other words, one frame per second.…”

Section: Related Work (mentioning)

confidence: 99%
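The "one frame per second" windowing rule mentioned in this statement can be illustrated with a short sketch. It reflects only the rule as quoted; the function name `sample_one_frame_per_second` and the choice to take the frame nearest each whole second are assumptions made for illustration.

```python
from typing import List


def sample_one_frame_per_second(num_frames: int, fps: float) -> List[int]:
    """Return the indices of the frames nearest to t = 0 s, 1 s, 2 s, ..."""
    if num_frames < 0 or fps <= 0:
        raise ValueError("num_frames must be >= 0 and fps must be > 0")
    indices: List[int] = []
    second = 0
    while True:
        idx = int(round(second * fps))  # frame closest to this whole second
        if idx >= num_frames:
            break
        # Skip repeats, which can only occur for frame rates below 1 fps.
        if not indices or idx != indices[-1]:
            indices.append(idx)
        second += 1
    return indices
```

For a 25 fps clip this keeps every 25th frame starting at frame 0; for non-integer rates such as 29.97 fps the rounding keeps the samples aligned to whole seconds instead of drifting.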

KS‐FQA: Keyframe selection based on face quality assessment for efficient face recognition in video

Bahroun

Abed

Zagrouba

2020

IET Image Processing

Self Cite

“…This method utilizes the spectral clustering algorithm to cluster video frames, and then calculates cluster centres by the k-means algorithm to extract key frames. Gharbi et al. [42] propose a method using the graph modularization clustering principle to select key frames, which can retain the salient content of the video. The limitations of clustering algorithms are that key frame extraction is too dependent on clustering results, and the extraction results are not sequential.…”

Section: Clustering-based Methods (mentioning)

confidence: 99%

“…Gharbi et al. [42] propose a method using the graph modularization clustering principle to select key frames, which can retain the salient content of the video. The limitations of clustering algorithms are that key frame extraction is too dependent on clustering results, and the extraction results are not sequential.…”

Section: Related Work (mentioning)

confidence: 99%
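For readers unfamiliar with the modularity criterion that these statements refer to, the sketch below computes the standard Newman modularity Q of a given partition of a weighted, undirected frame-similarity graph. It is an illustration of the general quantity only: how Gharbi et al. build the graph, which similarity measure they use, and how they search for the best partition are not stated on this page, and the function name `modularity` is hypothetical.

```python
from typing import List, Sequence


def modularity(adj: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Newman modularity Q for a symmetric weight matrix and one label per node.

    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [labels_i == labels_j]
    where k_i is the weighted degree of node i and 2m is the total degree.
    """
    n = len(adj)
    if len(labels) != n:
        raise ValueError("labels must have one entry per node")
    degree: List[float] = [sum(row) for row in adj]
    two_m = sum(degree)
    if two_m == 0:
        return 0.0  # graph without edges: modularity is taken as 0 here
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m
```

A clustering-based key frame extractor would typically prefer partitions with higher Q and then pick one representative frame per cluster; that selection rule is not shown because the page gives no detail on it.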

Video key frame extraction based on scale and direction analysis

Dong

Zhang

et al.

2022

The Journal of Engineering

Currently, efficient and accurate key frame extraction for massive videos remains a challenge, especially in surveillance applications. To overcome the inaccuracy of traditional key frame extraction, this paper proposes a method for key frame extraction via analyzing scale and direction of motion trajectory on spatiotemporal slice (MTSS). The proposed method employs both local and global motion state changes of object as the metric to extract key frames and meanwhile gives a strategy which integrates coarse extraction with partial fine re‐extraction of spatiotemporal slices to capture these state changes. Firstly, coarse extraction of spatiotemporal slices is employed to detect motion segments. Then partial fine re‐extraction is performed on the motion segments to obtain MTSS. Finally, the frames at the inflexions of scale and direction of MTSS are extracted as key frames. The experimental results have demonstrated that the proposed method outperforms existing state‐of‐the‐art methods in terms of accuracy for various videos while requiring a comparable or even less computation time.

“…For general video summarization, there are many methods that use a set of automatically extracted key frames to represent the main content of the video [1,2]. These methods seek to find important scenes, objects, colors, and moving objects in videos and usually follow three steps, namely, video feature extraction, frame image clustering [3,4] or classification, and key frame selection. However, these methods do not scale well to lecture videos.…”

Section: Introduction (mentioning)

confidence: 99%

Lecture Video Automatic Summarization System Based on DBNet and Kalman Filtering

Sun

Tian

2022

Mathematical Problems in Engineering

Video summarization for educational scenarios aims to extract and locate the most meaningful frames from the original video based on the main contents of the lecture video. To address the tendency of existing computer vision-based lecture video summarization methods to target specific scenes, a summarization method based on content detection and tracking is proposed. Firstly, DBNet is introduced to detect contents such as text and mathematical formulas in the static frames of these videos, and it is combined with the convolutional block attention module (CBAM) to improve the detection precision. Then, frame-by-frame data association of content instances is performed using Kalman filtering, the Hungarian algorithm, and appearance feature vectors to build a tracker. Finally, video segmentation and key frame location extraction are performed according to the content instance lifelines and content deletion events constructed by the tracker, and the extracted key frame groups are used as the final video summary result. In experiments on a variety of lecture video scenarios, the average precision of content detection is 89.1% and the average recall of the summary results is 92.1%.
