UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Streaming algorithms with differential privacy guarantee Gong, Zehui

Abstract

Given a data stream (a sequence of items), the distinct element estimation problem aims to approximate the number of unique items in the stream. We present two streaming algorithms to solve the distinct element estimation problem, and we prove that both are differentially private. The first algorithm solves the decision version of distinct element estimation, and then solves the general version by making calls to the decision version. This algorithm is conceptually simple, but somewhat inefficient due to this reduction to the decision version. We then extend this algorithm and its analysis to handle deletions. We show that the original algorithm and its extension use logarithmic space and can output a good estimate of distinct elements with high probability. The second algorithm is based on the Johnson-Lindenstrauss dimensionality reduction technique, but it replaces the Gaussian distribution with a stable distribution. With an appropriate choice of parameters, this approach can estimate distinct elements up to arbitrary multiplicative accuracy with high probability. We prove asymptotically tight upper and lower bounds for the density of these stable distributions using an infinite sum representation. Using those bounds, for the special case of dimensionality reduction to one dimension, we prove that this algorithm is differentially private.

Item Media

Item Citations and Data

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International