UBC Theses and Dissertations

Facilitating user interaction with data

Zolaktaf Zadeh, Zeinab

Abstract

In many domains, users interact with data stored in large, and often structured, data sources. This thesis addresses three phases of user interaction: (1) data exploration, (2) query composition, and (3) query answer analysis. It provides methods to assist in each of these phases, although no single thesis can cover every form of user interaction within them. The first part of the thesis focuses on improving data exploration with recommender systems. Standard recommendation models are biased toward popular items in their suggestions. Our approach is to analyze past interaction logs to estimate each user's preference for exploration and novelty. We present a generic framework that increases the novelty of recommendations based on each user's novelty preference. The next part of the thesis examines ways of facilitating query composition. We study models that analyze past query logs to estimate query properties, such as answer size or error type. By predicting these properties prior to query execution, we can help users tune and optimize their queries. Empirical results show that data-driven machine learning models can accurately perform several of the prediction tasks. The final part of the thesis studies methods for improving the analysis of large or conflicting query answers. This problem is common in integration contexts where data is segmented across several sources with overlapping and conflicting data values. Depending on which combination of sources and values is used, a simple query can have an overwhelming number of correct and conflicting answers. The approach presented is based on efficiently estimating a query answer distribution. Further, it offers a suite of methods for extracting statistics that convey meaningful information about the answer set. Overall, the solutions developed in this thesis aim to increase the efficiency and decision quality of users. Empirical results on real-world datasets show that the proposed problems and solutions are important steps toward making information easily accessible to users.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International