The Diamond model is a multidimensional model dedicated to XML document warehouses. It handles structured and unstructured data simultaneously. Furthermore, it orders the semantics of documents via a specific semantic dimension that is linked to the conventional dimensions, thus breaking the classical rule that dimensions are orthogonal. After giving an overview of their three-phase, quasi-automatic approach for generating the Diamond model, the authors focus on Diamond-Gen, the software tool that supports the proposed approach. They illustrate the functionalities of Diamond-Gen and assess the tool through an experimental study on a set of 1500 XML documents drawn from the PubMed collection.
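To make the idea of a semantic dimension linked to conventional dimensions more concrete, here is a minimal Python sketch. It is not taken from the paper or from Diamond-Gen: the class names `Dimension` and `SemanticDimension`, the methods `link` and `members_for`, and the sample values are all hypothetical, and the way links are stored is only one possible reading of the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Dimension:
    """A conventional dimension (hypothetical), e.g. Author or Date."""
    name: str
    members: set[str] = field(default_factory=set)


@dataclass
class SemanticDimension(Dimension):
    """A semantic dimension whose concepts point at members of other dimensions.

    Assumption: links are kept as concept -> dimension name -> member values.
    The abstract only says the semantic dimension is linked to the
    conventional ones; it does not say how.
    """
    links: dict[str, dict[str, set[str]]] = field(default_factory=dict)

    def link(self, concept: str, dimension: Dimension, member: str) -> None:
        # Refuse links to values the conventional dimension does not contain.
        if member not in dimension.members:
            raise ValueError(f"{member!r} is not a member of {dimension.name}")
        self.members.add(concept)
        self.links.setdefault(concept, {}).setdefault(dimension.name, set()).add(member)

    def members_for(self, concept: str, dimension_name: str) -> set[str]:
        # Members of a conventional dimension reachable from a concept.
        return self.links.get(concept, {}).get(dimension_name, set())


# Illustrative values only; they do not come from the PubMed experiment.
authors = Dimension("Author", {"A. Smith", "B. Jones"})
semantic = SemanticDimension("Semantic")
semantic.link("diabetes", authors, "A. Smith")
print(semantic.members_for("diabetes", "Author"))
```

In a classical star schema the two dimensions would be related only through the fact table. Here the explicit `links` mapping is what makes them non-orthogonal, which is the property the abstract highlights.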
The data warehouse community has paid particular attention to the document warehouse (DocW) paradigm during the last two decades. However, some important issues related to semantics are still pending and need in-depth investigation. Indeed, the semantic exploitation of the DocW is not yet mature, even though it is a main concern for decision-makers. This paper aims to enhance the multidimensional model called the Diamond Document Warehouse Model with semantic aspects; in particular, it suggests semantic OLAP (on-line analytical processing) operators for querying the DocW.