Yuanling Zhu scite author profile

Abstract. As XML is increasingly being used in Web applications, new technologies need to be investigated for processing XML documents with high performance. Parallelism is a promising solution for structured document processing, and data placement is a major factor for system performance improvement in parallel processing. This paper describes an effective XML document data placement strategy. The new strategy is based on a multilevel graph partitioning algorithm with the consideration of the unique features of XML documents and query distributions. A new algorithm, which is based on XML query schemas to derive the weighted graph from the labelled directed graph presentation of XML documents, is also proposed. Performance analysis on the algorithm presented in the paper shows that the new data placement strategy exhibits low workload skew and a high degree of parallelism.

Keywords: Data Placement, XML Documents, Graph Partitioning, and Parallel Data Processing.

Introduction

As a new markup language for structured documentation, XML (eXtensible Markup Language) is increasingly being used in Web applications because of its unique features in data representation and exchange. The main advantage of XML is that each XML file can have a semantic schema, which makes it possible to define much more meaningful queries than simple, keyword-based retrievals. A recent survey shows that the number of XML business vocabularies has increased from 124 to over 250 in six months [1]. It can be expected that data in XML format would be largely available throughout the Web in the near future. As Web applications are time vulnerable, the increasing size of XML documents and the complexity of evaluating XML queries pose new performance challenges to existing information retrieval technologies. The use of parallelism has shown good scalability in traditional database applications and provides an attractive solution to process structured documents [2]. A large number of XML documents can be distributed onto several processing nodes so that a reasonable query response time can be achieved by processing the related data in parallel.


Performance modelling and metrics of database-backed Web sites

Zhu


Performance Analysis of Web Database Systems

Zhu

2000


Parallel processing XML documents

Zhu, Sun, Lin et al.


As Web applications are time vulnerable, the increasing size of XML documents and the complexity of evaluating XML queries pose new performance challenges to existing information retrieval technologies. This paper introduces a new approach for developing a purpose-built XML data management system to improve the system performance of a Web site with XML support by using parallel data processing techniques. To improve the system performance, we proposed a parallelisation model for XML data processing, where the data storage strategies, data placement methods and query evaluation techniques have been studied. Other related issues are also presented.




Throughput and buffer analysis for GSM General Packet Radio Service (GPRS)

An Effective Data Placement Strategy for XML Documents
