There has been considerable research and industrial effort on building XQuery engines over different kinds of XML storage and index models. However, most of these efforts focus either on an efficient XQuery engine designed for one particular XML storage, index, and view model, or on a general XQuery engine that does not consider the underlying XML storage, index, and view model at all. We need an underlying framework for building an XQuery engine that can work with, and optimize for, different XML storage, index, and view models. Besides XQuery, RDBMSs also support SQL/XML, a standard language that integrates XML and relational processing. There are industrial efforts to build hybrid XQuery and SQL/XML engines that support both languages, so that users can manage and query both relational and XML data on one platform. However, we need a theoretical framework to optimize both SQL/XML and XQuery in one RDBMS. In this paper, we present our industrial work on building a combined XQuery and SQL/XML engine that works with, and optimizes for, different kinds of XML storage and index models in Oracle XMLDB. This work uses an XML-extended relational algebra as the underlying tuple-based logical algebra and incorporates tree-based and automata-based physical algebras into it, so as to provide optimization for different physical XML formulations. The result is a set of logical and physical rewrite techniques that optimize XQuery and SQL/XML over a variety of physical XML storage, index, and view models, including schema-aware object-relational XML storage with relational indexes, binary XML storage with a schema-agnostic path-value-order key XMLIndex, SQL/XML views over relational data, and relational views over XML. Furthermore, we show how a cost-based XML physical rewrite strategy can be used to evaluate alternative physical rewrite plans.
Oracle RDBMS has supported XML data management for more than six years, since version 9i. Prior to 11g, text-centric XML documents could be stored as-is in a CLOB column, and schema-based data-centric documents could be shredded and stored in object-relational (OR) tables mapped from their XML Schema. However, both storage formats have intrinsic limitations: XML/CLOB has unacceptable query and update performance, and XML/OR requires an XML schema. To tackle this problem, Oracle 11g introduces a native Binary XML storage format and a complete stack of data management operations. Binary XML was designed to address a wide range of real application problems encountered in XML data management, including schema flexibility, amenability to XML indexes, update performance, and schema evolution, to name just a few. In this paper, we introduce the Binary XML storage format based on the Oracle SecureFiles System [21]. We propose a lightweight navigational index on top of the storage and an NFA-based navigational algorithm to provide efficient streaming processing. We further optimize query processing by exploiting XML structural and schema information that is collected in the database dictionary. We conducted extensive experiments to demonstrate the high performance of the native Binary XML in query processing, update, and space consumption.
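To make the idea of NFA-based streaming navigation concrete, the following is a minimal sketch in Python. It is not the algorithm from the paper and does not use any Oracle API: it assumes a tiny path language with only the child (`/`) and descendant (`//`) axes, name tests, and `*`, and the names `compile_path` and `stream_match` are hypothetical. The automaton keeps one set of active states per open element, so a document can be matched in a single pass over start and end events.

```python
import io
import re
import xml.etree.ElementTree as ET


def compile_path(path):
    """Turn '/a//b' into a list of (axis, name) steps; '//' is the descendant axis."""
    return [("desc" if sep == "//" else "child", name)
            for sep, name in re.findall(r"(//|/)([^/]+)", path)]


def stream_match(xml_text, path):
    """Return the root-to-node tag paths of all elements matched by `path`.

    NFA state i means "the first i steps have been matched". One state set is
    kept per open element, so closing an element simply pops the stack.
    """
    steps = compile_path(path)
    final = len(steps)
    stack = [{0}]      # state sets, one per nesting level; {0} is the document root
    open_tags = []     # tags of the currently open elements
    matches = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            open_tags.append(elem.tag)
            nxt = set()
            for state in stack[-1]:
                if state == final:
                    continue  # the accepting state has no outgoing transitions
                axis, name = steps[state]
                if name in ("*", elem.tag):
                    nxt.add(state + 1)  # this element satisfies the step
                if axis == "desc":
                    nxt.add(state)      # self-loop: '//' may skip this element
            stack.append(nxt)
            if final in nxt:
                matches.append("/" + "/".join(open_tags))
        else:
            stack.pop()
            open_tags.pop()
            elem.clear()  # drop content that is no longer needed

    return matches


doc = "<a><b><c/></b><d><b><c/><e><c/></e></b></d><c/></a>"
print(stream_match(doc, "/a//b/c"))
```

In this sketch the descendant axis is modeled as a self-loop on the state preceding the step, which is what makes the automaton nondeterministic: after entering a `b`, the matcher is simultaneously waiting for a `c` child and still looking for a deeper `b`.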