Abstract: In this paper, we present the current state of the art in semantic data modeling of multimedia data. Semantic conceptualization can be performed at several levels of information granularity, leading to multilevel indexing and searching mechanisms. Various models at different levels of granularity are compared. At the finest level of granularity, multimedia data can be indexed based on image contents, such as identification of objects and faces. At a coarser level of granularity, indexing of multimedia data can be focused on events and episodes, which are higher level abstractions. In light of the above, we also examine modeling and indexing techniques for multimedia documents.
In this paper, we propose a graphical data model for specifying spatio-temporal semantics of video data. The proposed model segments a video clip into subsegments consisting of objects. Each object is detected and recognized, and the relevant information of each object is recorded. The motions of objects are modeled through their relative spatial relationships as time evolves. Based on the semantics provided by this model, a user can create his/her own object-oriented view of the video database. Using propositional logic, we describe a methodology for specifying conceptual queries involving spatio-temporal semantics and expressing views for retrieving various video clips. Alternatively, a user can sketch the query by exemplifying the concept. The proposed methodology can be used to specify spatio-temporal concepts at various levels of information granularity.
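As a rough illustration of modeling motion through relative spatial relationships over time, the sketch below reduces two object tracks to a sequence of coarse relations. It is not the paper's graphical model: the `Box` type, the three relation names, and the horizontal-only comparison are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one object in one frame (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def spatial_relation(a: Box, b: Box) -> str:
    """Coarse relation of a relative to b along the horizontal axis only."""
    if a.x_max < b.x_min:
        return "left_of"
    if a.x_min > b.x_max:
        return "right_of"
    return "overlaps_x"


def relation_trace(track_a, track_b):
    """Collapse per-frame relations into (relation, first_frame, last_frame) runs."""
    runs = []
    for frame, (a, b) in enumerate(zip(track_a, track_b)):
        rel = spatial_relation(a, b)
        if runs and runs[-1][0] == rel:
            # Same relation as the previous frame: extend the current run.
            runs[-1] = (rel, runs[-1][1], frame)
        else:
            runs.append((rel, frame, frame))
    return runs


# Illustrative data: a car moving left to right past a stationary tree.
car = [Box(0, 0, 2, 1), Box(3, 0, 5, 1), Box(6, 0, 8, 1), Box(9, 0, 11, 1)]
tree = [Box(6, 0, 7, 3)] * 4
trace = relation_trace(car, tree)
print(trace)
# [('left_of', 0, 1), ('overlaps_x', 2, 2), ('right_of', 3, 3)]
```

A query such as "the car passes the tree" could then be phrased as a pattern over this trace (`left_of`, then `overlaps_x`, then `right_of`) rather than over raw frames.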
This paper presents a framework for data modeling and semantic abstraction of image/video data. The framework is based on spatio-temporal information associated with salient objects in an image or in a sequence of video frames and on a set of generalized n-ary operators defined to specify spatial and temporal relationships of objects present in the data. The methodology presented in this paper can manifest itself effectively in conceptualizing events and heterogeneous views of multimedia data as perceived by individual users. The proposed paradigm induces a multilevel indexing and searching mechanism that models information at various levels of granularity and hence allows processing of content-based queries in real time. We also devise a unified object-oriented interface for users with heterogeneous views to specify queries on the unbiased encoded data. Currently this framework is being developed to realize a highly integrated multimedia database architecture.
In this paper, we propose a multi-level abstraction mechanism for capturing the spatial and temporal semantics associated with various objects in an input image or in a sequence of video frames. This abstraction can manifest itself effectively in conceptualizing events and views in multimedia data as perceived by individual users. The objective is to provide an efficient mechanism for handling content-based queries, with the minimum amount of processing performed on raw data during query evaluation. We introduce a multi-level architecture for video data management at different levels of abstraction. The architecture facilitates a multi-level indexing/searching mechanism. At the finest level of granularity, video data can be indexed based on the mere appearance of objects and faces. For management of information at higher levels of abstraction, an object-oriented paradigm is proposed which is capable of supporting domain-specific views.
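The two-level indexing idea can be sketched in a few lines, under strong simplifying assumptions. Here the finest level maps object labels to the frames where they appear, and a coarser "event" level is derived from it by plain co-occurrence of required objects. The function names, the sample labels, and the co-occurrence rule are hypothetical and are not taken from the paper, which describes an object-oriented paradigm for the higher levels.

```python
from collections import defaultdict


def build_object_index(detections):
    """Finest level: map each object label to the sorted frames where it appears."""
    index = defaultdict(set)
    for frame, labels in detections.items():
        for label in labels:
            index[label].add(frame)
    return {label: sorted(frames) for label, frames in index.items()}


def build_event_index(object_index, event_definitions):
    """Coarser level (assumed rule): an event holds in frames where all of its
    required objects co-occur. Built from the object index, not from raw frames."""
    events = {}
    for name, required in event_definitions.items():
        frame_sets = [set(object_index.get(label, [])) for label in required]
        events[name] = sorted(set.intersection(*frame_sets)) if frame_sets else []
    return events


# Illustrative per-frame detections (made-up data).
detections = {
    0: {"player"},
    1: {"player", "ball"},
    2: {"player", "ball", "goal"},
    3: {"goal"},
}
object_index = build_object_index(detections)
event_index = build_event_index(
    object_index, {"shot_on_goal": ["player", "ball", "goal"]}
)
print(object_index["ball"])  # [1, 2]
print(event_index)           # {'shot_on_goal': [2]}
```

Because the event level is computed from the object index, a content-based query at the event level does not need to touch the raw frames at query time, which is the property the abstract emphasizes.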