This paper proposes a content-based temporal video segmentation system that integrates syntactic (domainindependent) and semantic (domain-dependent) features for automatic management of video data. Temporal video segmentation includes scene change detection and shot classification. The proposed scene change detection method consists of two steps: detection and tracking of semantic objects of interest specified by the user, and an unsupervised method for detection of cuts, and edit effects. Object detection and tracking is achieved using a region matching scheme, where the region of interest is defined by the boundary of the object. A new unsupervised scene change detection method based on two-class clustering is introduced to eliminate the data dependency of threshold selection. The proposed shot classification approach relies on semantic image features and exploits domain-dependent visual properties such as shape, color, and spatial configuration of tracked semantic objects. The system has been applied to segmentation and classification of TV programs collected from different channels. Although the paper focuses on news programs, the method can easily be applied to other TV programs with distinct semantic structure.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.