This paper focuses on the problem of key-frame coding and proposes a promising new approach based on fractals. The summary, a set of key-frames selected from a full-length video sequence, is coded with a 3D fractal scheme. This allows the video presentation tool to expand the video sequence in a "natural" way by exploiting the ability of fractals to reproduce the signal at several resolutions. This feature is an important novelty of this work with respect to alternative approaches, which mainly focus on the compression ratio without taking into account the presentation aspect of the video summary. In devising the coding scheme, we have taken into account the computational complexity inherent in fractal coding. Accordingly, the key-frames are first wavelet transformed, and fractal coding is then applied to each subband to reduce the search range. Experimental results show the effectiveness of the proposed approach.

Keywords: Fractals, wavelet, video processing, multimedia.

Manuscript received Sept. 19, 2003; revised Oct. 18, 2005. Luigi Atzori (phone: +39 070 675 5902, email: l.atzori@diee.unica.it), Daniele D. Giusto (email: ddgiusto@unica.it), and Maurizio Murroni (email: murroni@diee.unica.it) are with the Department of Electrical and Electronic Engineering, University of Cagliari, Italy.
I. Introduction

The rapid spread of multimedia applications (from scientific to commercial, and from informative to recreational) over heterogeneous networks has generated a great deal of interest in the scientific community in the fields of signal processing and data transmission. In most of these applications, digital video archives are browsed over distributed networks that are subject to buffer congestion and bandwidth constraints. To enable these services, it is important to develop tools that analyze and describe the video content, handle queries from end users, and provide results. These operations require extracting the essence of the visual content in a compact form so as to permit fast browsing of huge multimedia archives. Accordingly, a procedure for automatic video data analysis and indexing has become a requirement for efficient database content searching and management. Such a procedure mainly consists of the following tasks [1]: feature extraction, structure analysis, abstraction, and indexing. The first task aims to provide the major characteristics of the video (such as color, texture, shape, structure, layout, and motion) that can be converted into semantic concepts. Video structure parsing is the next step in overall video-content analysis; it is the process of extracting the temporal structural information of video sequences or programs. Video abstraction is the process of creating a presentation of visual information about a landscape or the structure of video, which should be much shorter than the original video. Based on the output of the previous tasks, video indices are built so as to enable fast browsing of the visual content.

Researchers have extensively investi...