Managing updates and transformations in data sharing systems

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Managing updates and transformations in data sharing systems Thrastarson, Arni Mar

Abstract

Dealing with dirty data is an expensive and time consuming task. Estimates suggest that up to 80% of the total cost of large data projects is spent on data cleaning alone. This work is often done manually by domain experts in data applications, working with data copies and limited database access. We propose a new system of update propagation to manage data cleaning transformations in such data sharing scenarios. By spreading the changes made by one user to all users working with the same data, we hope to reduce repeated manual labour and improve overall data quality. We describe a modular system design, drawing from different research areas of data management, and highlight system requirements and challenges for implementation. Our goal is not to achieve full synchronization, but to propagate updates that individual users consider valuable to their operation.

Item Metadata

Title	Managing updates and transformations in data sharing systems
Creator	Thrastarson, Arni Mar
Publisher	University of British Columbia
Date Issued	2014
Description	Dealing with dirty data is an expensive and time consuming task. Estimates suggest that up to 80% of the total cost of large data projects is spent on data cleaning alone. This work is often done manually by domain experts in data applications, working with data copies and limited database access. We propose a new system of update propagation to manage data cleaning transformations in such data sharing scenarios. By spreading the changes made by one user to all users working with the same data, we hope to reduce repeated manual labour and improve overall data quality. We describe a modular system design, drawing from different research areas of data management, and highlight system requirements and challenges for implementation. Our goal is not to achieve full synchronization, but to propagate updates that individual users consider valuable to their operation.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2014-10-09
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI	10.14288/1.0135578
URI	http://hdl.handle.net/2429/50729
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2014-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/2.5/ca/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Managing updates and transformations in data sharing systems Thrastarson, Arni Mar

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights