UBC Theses and Dissertations

Provenance in relational databases: usability and applications

AlOmeir, Omar

Abstract

Data provenance is any information about the origin of a piece of data and the process that led to its creation. Most database provenance work has focused on creating models and semantics to query and generate this provenance information. While comprehensive, provenance information remains large and overwhelming, which can make it hard for data provenance systems to support data exploration or other meaningful applications. This thesis focuses on facilitating the use of database provenance through visual interfaces, summarization techniques, and curation techniques for real-world applications. In the first part, we present visualization techniques for provenance information in relational databases. Our visualizations address every part of provenance information to facilitate user exploration. Through a user experiment, we show that our approach improves the accuracy and efficiency of performing exploration tasks. The next part addresses the challenge posed by the volume of provenance information, specifically in the case of aggregation queries. The volume increases with the size of the database and creates a "needle in a haystack" problem. We present novel summarization techniques that build on the existing summarization literature. Our techniques support exploration by users who are not familiar with the data or its provenance. The final part applies our summarization techniques to the problem of refining aggregate queries. Aggregate queries pose a challenge in that they present ambiguous results to inexperienced users. Query refinement can help users recognize errors in their queries and fix them. Through a user experiment, we present evidence of the usefulness and usability of our methods. Overall, the goal of this thesis is to facilitate the use of provenance information in relational databases. Through novel techniques and user-centric evaluation, we present solutions and user interaction methods that enable new applications in this domain.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International