UBC Theses and Dissertations
Privacy-preserving publishing of moving objects databases
Yarovoy, Roman
Moving Objects Databases (MODs) have become a popular research subject owing to recent developments in positioning technologies and mobile networking. Analysis of mobility data can yield knowledge that enhances public welfare. For instance, a study of traffic patterns and congestion trends can reveal information that helps improve the routing and scheduling of public transit vehicles. To enable such analysis, a MOD must be published. However, publishing a MOD can threaten the location privacy of the users whose movements are recorded in it. A user's location at one or more time points may be publicly known before the MOD is published. Based on this public knowledge, an attacker can potentially identify the user's entire trajectory and learn his or her positions at other time points, which constitutes a privacy breach. This public knowledge is the user's quasi-identifier (QID), i.e. a set of attributes that can uniquely identify the user's trajectory in the published database. We argue that, unlike in relational microdata, where all tuples have the same set of quasi-identifiers, in mobility data the concept of a quasi-identifier must be modeled subjectively, on an individual basis. In this work, we study the problem of privacy-preserving publication of MODs. We conjecture that each Moving Object (MOB) may have a distinct QID. We develop a possible attack model on the published MOD, given public knowledge about some or all MOBs. We develop a k-anonymity model (based on classical k-anonymity), which ensures that every object is indistinguishable (with respect to its QID) from at least k-1 other objects, and show that this model is impervious to the proposed attack model. We employ space generalization to achieve MOB anonymity. We propose three anonymization algorithms that generate a MOD satisfying the k-anonymity model while minimizing information loss.
We conduct several sets of experiments on synthetic and real-world data sets of vehicular traffic to analyze and evaluate our proposed algorithms.
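To make the ideas of a per-object QID and space generalization concrete, the following is a minimal Python sketch. It is not one of the thesis's three algorithms, and it does not reproduce the thesis's attack model. It assumes that a MOD maps each MOB to its positions at discrete time points, that a QID is simply a set of time points, and that generalization replaces a point by a bounding rectangle. All names (`to_boxes`, `generalize_group`, `is_k_anonymous`) are hypothetical, and the check is a simplified reading of "indistinguishable with respect to its QID".

```python
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def to_boxes(mod: Dict[str, Dict[int, Point]]) -> Dict[str, Dict[int, Box]]:
    """Represent every exact position as a degenerate (zero-area) box."""
    return {
        mob: {t: (x, y, x, y) for t, (x, y) in traj.items()}
        for mob, traj in mod.items()
    }


def generalize_group(
    mod: Dict[str, Dict[int, Point]],
    group: Iterable[str],
    times: Iterable[int],
) -> Dict[str, Dict[int, Box]]:
    """Space generalization (illustrative): at each time in `times`, replace
    the position of every group member by the group's bounding box."""
    group = list(group)
    published = to_boxes(mod)
    for t in times:
        xs = [mod[m][t][0] for m in group]
        ys = [mod[m][t][1] for m in group]
        box = (min(xs), min(ys), max(xs), max(ys))
        for m in group:
            published[m][t] = box
    return published


def is_k_anonymous(
    published: Dict[str, Dict[int, Box]],
    qid: Dict[str, Set[int]],
    k: int,
) -> bool:
    """Simplified check: each MOB must share its published region, at every
    time point in its own QID, with at least k - 1 other MOBs."""
    for mob, traj in published.items():
        matches = sum(
            1
            for other in published.values()
            if all(t in traj and other.get(t) == traj[t] for t in qid[mob])
        )
        if matches < k:  # `matches` counts the MOB itself
            return False
    return True


# Toy data: two MOBs with different QIDs (A is known at t=1, B at t=2).
mod = {
    "A": {1: (0.0, 0.0), 2: (5.0, 5.0)},
    "B": {1: (1.0, 1.0), 2: (9.0, 9.0)},
}
qid = {"A": {1}, "B": {2}}

raw_ok = is_k_anonymous(to_boxes(mod), qid, k=2)
published = generalize_group(mod, ["A", "B"], times={1, 2})
generalized_ok = is_k_anonymous(published, qid, k=2)
```

In this toy data the raw MOD fails the check for k = 2, because each object's exact position at its QID time is unique. After both objects are generalized to shared bounding boxes at times 1 and 2, the check passes. Larger boxes mean greater information loss, which is the quantity the abstract says the proposed algorithms try to minimize.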
Attribution-NonCommercial-NoDerivatives 4.0 International