From tree patterns to generalized tree patterns : on efficient evaluation of XQuery

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

From tree patterns to generalized tree patterns : on efficient evaluation of XQuery Chen, Zhimin

Abstract

XQuery is the de facto standard XML query language, and it is important to have efficient query evaluation techniques available for it. It is a well-known fact that a formal bulk algebra is essential for efficient query evaluation, and the Tree Algebra for XML (TAX), among others, is invented for this purpose. It can be shown in this thesis that a substantial subset of XQuery can be expressed as TAX. An XML document is often modelled as an ordered label tree. A core operation in the evaluation of XQuery is the finding of matches for specified tree patterns against the data tree (or forest), and there has been much work towards algorithms for finding such matches efficiently. Multiple XPath expressions can be evaluated by computing one or more tree pattern matches. However, because of the flexibility of XML data, the efficient evaluation of XQuery queries as a whole is much more than a tree pattern match and combining matchings of multiple tree patterns is not the most efficient evaluation plan for XQuery. In this thesis a structure called generalized tree pattern (GTP) is proposed to concisely represent a whole XQuery expression. Evaluating a query reduces to finding the matches of its GTP, which leads to more efficient evaluation plans. Algorithms are developed to translate an XQuery expression, possibly involving join, quantifiers, grouping, aggregation and nesting, to its GTP, and to generate efficient physical plans for a specified GTP. XML data often conforms to a schema. Relevant constraints from the schema give rise to further opportunities to optimize queries. Algorithms are given in the thesis to automatically infer structural constraints from a given schema and to simplify a GTP given a set of structural constraints. Finally, a detailed set of experiments using the TIMBER XML database system shows that plans via GTPs (with or without schema knowledge) significantly outperform plans based on navigation and straightforward plans obtained directly from the query.

Item Metadata

Title	From tree patterns to generalized tree patterns : on efficient evaluation of XQuery
Creator	Chen, Zhimin
Publisher	University of British Columbia
Date Issued	2003
Description	XQuery is the de facto standard XML query language, and it is important to have efficient query evaluation techniques available for it. It is a well-known fact that a formal bulk algebra is essential for efficient query evaluation, and the Tree Algebra for XML (TAX), among others, is invented for this purpose. It can be shown in this thesis that a substantial subset of XQuery can be expressed as TAX. An XML document is often modelled as an ordered label tree. A core operation in the evaluation of XQuery is the finding of matches for specified tree patterns against the data tree (or forest), and there has been much work towards algorithms for finding such matches efficiently. Multiple XPath expressions can be evaluated by computing one or more tree pattern matches. However, because of the flexibility of XML data, the efficient evaluation of XQuery queries as a whole is much more than a tree pattern match and combining matchings of multiple tree patterns is not the most efficient evaluation plan for XQuery. In this thesis a structure called generalized tree pattern (GTP) is proposed to concisely represent a whole XQuery expression. Evaluating a query reduces to finding the matches of its GTP, which leads to more efficient evaluation plans. Algorithms are developed to translate an XQuery expression, possibly involving join, quantifiers, grouping, aggregation and nesting, to its GTP, and to generate efficient physical plans for a specified GTP. XML data often conforms to a schema. Relevant constraints from the schema give rise to further opportunities to optimize queries. Algorithms are given in the thesis to automatically infer structural constraints from a given schema and to simplify a GTP given a set of structural constraints. Finally, a detailed set of experiments using the TIMBER XML database system shows that plans via GTPs (with or without schema knowledge) significantly outperform plans based on navigation and straightforward plans obtained directly from the query.
Extent	3330189 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-11-21
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051223
URI	http://hdl.handle.net/2429/15489
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2003-05
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_2004-0188.pdf -- 3.18MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

From tree patterns to generalized tree patterns : on efficient evaluation of XQuery Chen, Zhimin

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights