Video summarization aims to provide a compact representation that contains enough information for users to understand the entire content or its important events, and it serves as a fundamental step in content-based video analysis. This paper presents a novel sports video summarization algorithm that mines consistent fields of view using visual and temporal information in a fully unsupervised manner. After videos are broken into shots, a content-based similarity measure is proposed at the shot level to structurally analyze the visual matching cost of the original videos. A modified agglomerative hierarchical clustering is then performed with an energy-based function to match the statistical distribution of the various views in game videos, and a refined distance metric is proposed as the similarity measure between two shots. An extended temporal prior is introduced to reflect the fact that temporally neighboring shots with similar durations are more likely to belong to the same cluster. Experiments on a database of 6 sports genres with over 10251 minutes of video from different sources achieved an average accuracy of 91.5%, and quantitative results are presented to justify each design choice in our algorithm. The proposed algorithm is applied in the non-linear browsing service of Orangesports by France Telecom, and an Android-based app has been implemented for smart mobile devices.
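The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the energy-based function or the refined distance metric, so plain average linkage with a fixed stopping threshold is used here instead. The shot fields (`hist`, `index`, `duration`), the weight `alpha`, the cap `max_gap`, and the `threshold` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def shot_distance(a, b, alpha=0.3, max_gap=10):
    """Distance between two shots: a visual term plus a temporal term.

    Each shot is a dict with hypothetical fields: 'hist' (normalized color
    histogram), 'index' (position in the video), 'duration' (seconds).
    """
    # Visual cost: half the L1 distance between histograms, so it lies in [0, 1].
    visual = 0.5 * sum(abs(x - y) for x, y in zip(a["hist"], b["hist"]))
    # Temporal cost: shots far apart or with different durations cost more.
    gap = min(abs(a["index"] - b["index"]), max_gap) / max_gap
    dur = abs(a["duration"] - b["duration"]) / max(a["duration"], b["duration"])
    temporal = 0.5 * (gap + dur)
    return (1 - alpha) * visual + alpha * temporal


def cluster_shots(shots, threshold=0.35, alpha=0.3):
    """Average-linkage agglomerative clustering; returns lists of shot indices."""
    clusters = [[i] for i in range(len(shots))]

    def linkage(c1, c2):
        total = sum(shot_distance(shots[i], shots[j], alpha) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (p < q by construction).
        (p, q), best = min(
            (((p, q), linkage(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:  # assumed stopping rule, not from the paper
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Toy input: two alternating views (e.g. a wide view and a close-up).
shots = [
    {"hist": [0.8, 0.1, 0.1], "index": 0, "duration": 10},
    {"hist": [0.1, 0.8, 0.1], "index": 1, "duration": 3},
    {"hist": [0.8, 0.1, 0.1], "index": 2, "duration": 9},
    {"hist": [0.1, 0.8, 0.1], "index": 3, "duration": 3},
]
print(cluster_shots(shots))
```

On this toy input the visually matching shots with similar durations end up in the same cluster, which is the behavior the temporal prior is meant to encourage. In practice the weights and the threshold would have to be tuned on real data.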
This study addresses an unsupervised approach to extracting TV programs from massive raw TV streams, combining repetition-based detection of Inter-Programs (IPs) with an audio-based segmentation and classification algorithm. Both acoustic and visual information are used for IP detection so as to avoid missing true positives. A novel audio fingerprint scheme and a shot-based indexing algorithm are introduced to achieve efficient and superior detection performance. After the TV programs are further segmented into clips, Gaussian Mixture Models (GMMs) are used to classify the clips into three types: pure speech, non-pure speech, and non-speech. Experiments on a test dataset of more than 500 hours of content-unknown TV streams show that the F-measures of program extraction and content analysis reach 0.986 and 0.887, respectively. The experiments also demonstrate that the proposed algorithm for detecting repeated IPs outperforms the state-of-the-art approach.
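The idea of repetition-based detection can be sketched as follows, under strong simplifying assumptions. The abstract does not describe the fingerprint scheme or the shot-based index, so this example uses a hypothetical coarse quantization of per-shot feature vectors as the fingerprint and a plain hash table as the index. The names `quantize`, `find_repeated_shots`, `step`, and `min_count` are illustrative only.

```python
from collections import defaultdict


def quantize(features, step=0.1):
    """Hypothetical fingerprint: round each feature to a coarse grid cell."""
    return tuple(round(x / step) for x in features)


def find_repeated_shots(shots, step=0.1, min_count=2):
    """Group shot positions whose fingerprints collide at least min_count times.

    `shots` is a list of feature vectors, one per shot, in stream order.
    Repeated groups are candidate inter-program material (e.g. jingles, ads).
    """
    index = defaultdict(list)  # fingerprint -> positions in the stream
    for position, features in enumerate(shots):
        index[quantize(features, step)].append(position)
    return [positions for positions in index.values() if len(positions) >= min_count]


# Toy stream: shots 0 and 2 are near-duplicates, as are shots 1 and 4.
stream = [
    [0.12, 0.50],
    [0.90, 0.31],
    [0.11, 0.52],
    [0.40, 0.40],
    [0.91, 0.29],
]
print(find_repeated_shots(stream))
```

Hashing keeps lookup cost roughly constant per shot, which matters for streams of hundreds of hours. Grid quantization can separate near-duplicates that fall on opposite sides of a cell boundary, so a real system would need a more robust fingerprint.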