A Performance Study of XML Query Optimization Techniques

Degree
PhD, University of Cincinnati, Engineering : Computer Science and Engineering.
Abstract
As computers and technology become more commonplace and essential to everyday life, more data is captured, stored, and analyzed by a variety of institutions in government, education, and the private sector. As this amount of data grows, so does the need for efficient methodologies and tools to store, retrieve, and transform it. A common way to store this schemaless, semi-structured data is the Extensible Markup Language (XML); in this view, an XML document is treated as a database. With a sizable amount of data stored in a common format, one problem is how to query XML documents efficiently. While relational database management systems contain built-in query optimizers, no such framework exists for XML databases. A multitude of document shapes, query shapes, index structures, and query techniques exist for XML databases, but the implications of these choices and their effects on query processing have not been investigated in a common framework. This dissertation identifies a set of representative query techniques, document structures, and query styles for XML databases and provides a common framework for classifying them. We identify two broad classifications of query techniques, native XML and non-native XML, and develop a cost-based model for each technique that models query performance from an execution standpoint. We also develop our own query technique, RDBQuery, as an extension and major enhancement to a previously existing non-native XML query technique that leverages a relational database management system to efficiently process XML queries. To evaluate relative query performance, we compare the techniques across parameters that affect their performance, including query shape and document shape and size, and present the results in a series of graphs. These graphs and their underlying cost models are used to present an optimization framework for XML queries, which provides the essential foundation for developing an integrated cost-based XML query optimizer.
Subject Headings
Computer science
Keywords
XML; query; optimization
Committee / Advisors
Karen Davis, PhD (Committee Chair)
Raj Bhatnagar, PhD (Committee Member)
John Schlipf, PhD (Committee Member)
Fred Annexstein, PhD (Committee Member)
Hsiang-Li Chiang, PhD (Committee Member)
Pages
279p.

Document number: ucin1258475256

This ETD has been downloaded 318 times (through March 2013)