On managing visibility of resources in social networking sites

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

On managing visibility of resources in social networking sites Su, Ying

Abstract

The social network sites has been very popular these days. In these systems, re sources are associated with each user (i.e., network node) in the network, and these resources include both soft and physical resources, such as pictures, videos, books, etc. Users would like to control who can see their resources, e.g., any one, only friends, friends and families, friends of friends, etc. The rules defining who can see the resources are called visibility views, and in our system they are identified by the owner nodes and regular path expressions. The users can issue queries for discovering resources, e.g., “Find all lawnmowers in my three degree network that are visible to me”. The evaluation process needs to check all candidate resources to see if they are visible to the query issuer, and this requires computing the visibility views associated with every candidate resources. The visibility views are expensive to evaluate. In order to facilitate the visibility query answering, it is necessary to pre-compute and materialize the views in the so called “cache”. But because of the tremendous volume of the whole view data, we have to select only a portion of them to materialize, based on metrics such as the views’ sizes, the costs to evaluate them, the popularity of references, the temporal locality of requested views, and the correlation relationships among them. The problem of trying to select the views in a static way is called the static view selection problem, and it has been studied extensively by the database community, mostly for answering OLAP queries in data warehouses. We call it “static” because it pre-computes and materializes the views in one lump-sum, based on statistics accumulated before the query sequence starts. This problem has a deep root from the caching problem studied in the theory and system communities. The researches done on caching policies solve the problem in an ad-hoc dynamic way by making decisions at each request, which make use of the temporal locality. Many ad-hoc policies have been proposed in the literature. Specifically, there is a branch of research done based on the so called Independent Reference Model, in which view requests are assumed to form a fixed time-independent sequence. It was shown that under this model optimal policies can be found. Coffman and Denning [13] presented a policy called A₀ that is optimal with respect the long-run average metric for paging, where the cached objects (i.e., memory pages) have uniform sizes and retrieval costs. Bahat [7] later presented a stationary Markov replacement policy C₀, which is optimal for cases in which the retrieval costs for the cached objects are arbitrary but their sizes are uniform. In this thesis we address the problem one step further: to assume both the sizes and costs are arbitrary. We reveal the intrinsic relationship between the static view selection problem and the caching problem, and show that, still under the independent reference model, any dynamic replacement policy (random or deterministic) are no better than the optimal static selection algorithm in terms of the long-run average metric. We then present a branch of optimal history dependent replacement policies for the arbitrary sizes and arbitrary costs case, and show that policy A₀ and C₀ are special cases of our policy. Furthermore, we prove that a policy is optimal if and only if the induced stochastic process is a unichain. We also study the approximate static algorithms and policies: we bring up polynomial static algorithms that are both K-approximate with regard to the fractional optimum solution and HKF-approximate to the integral optimum solution, where K is the size of the cache, K’ is the total size of all view objects minus K, and HKI is the harmonic number of K’. In addition, we present a K-competitive dynamic policy and show that K is the best approximation ratio for both the static algorithms and dynamic policies.

Item Metadata

Title	On managing visibility of resources in social networking sites
Creator	Su, Ying
Publisher	University of British Columbia
Date Issued	2008
Description	The social network sites has been very popular these days. In these systems, re sources are associated with each user (i.e., network node) in the network, and these resources include both soft and physical resources, such as pictures, videos, books, etc. Users would like to control who can see their resources, e.g., any one, only friends, friends and families, friends of friends, etc. The rules defining who can see the resources are called visibility views, and in our system they are identified by the owner nodes and regular path expressions. The users can issue queries for discovering resources, e.g., “Find all lawnmowers in my three degree network that are visible to me”. The evaluation process needs to check all candidate resources to see if they are visible to the query issuer, and this requires computing the visibility views associated with every candidate resources. The visibility views are expensive to evaluate. In order to facilitate the visibility query answering, it is necessary to pre-compute and materialize the views in the so called “cache”. But because of the tremendous volume of the whole view data, we have to select only a portion of them to materialize, based on metrics such as the views’ sizes, the costs to evaluate them, the popularity of references, the temporal locality of requested views, and the correlation relationships among them. The problem of trying to select the views in a static way is called the static view selection problem, and it has been studied extensively by the database community, mostly for answering OLAP queries in data warehouses. We call it “static” because it pre-computes and materializes the views in one lump-sum, based on statistics accumulated before the query sequence starts. This problem has a deep root from the caching problem studied in the theory and system communities. The researches done on caching policies solve the problem in an ad-hoc dynamic way by making decisions at each request, which make use of the temporal locality. Many ad-hoc policies have been proposed in the literature. Specifically, there is a branch of research done based on the so called Independent Reference Model, in which view requests are assumed to form a fixed time-independent sequence. It was shown that under this model optimal policies can be found. Coffman and Denning [13] presented a policy called A₀ that is optimal with respect the long-run average metric for paging, where the cached objects (i.e., memory pages) have uniform sizes and retrieval costs. Bahat [7] later presented a stationary Markov replacement policy C₀, which is optimal for cases in which the retrieval costs for the cached objects are arbitrary but their sizes are uniform. In this thesis we address the problem one step further: to assume both the sizes and costs are arbitrary. We reveal the intrinsic relationship between the static view selection problem and the caching problem, and show that, still under the independent reference model, any dynamic replacement policy (random or deterministic) are no better than the optimal static selection algorithm in terms of the long-run average metric. We then present a branch of optimal history dependent replacement policies for the arbitrary sizes and arbitrary costs case, and show that policy A₀ and C₀ are special cases of our policy. Furthermore, we prove that a policy is optimal if and only if the induced stochastic process is a unichain. We also study the approximate static algorithms and policies: we bring up polynomial static algorithms that are both K-approximate with regard to the fractional optimum solution and HKF-approximate to the integral optimum solution, where K is the size of the cache, K’ is the total size of all view objects minus K, and HKI is the harmonic number of K’. In addition, we present a K-competitive dynamic policy and show that K is the best approximation ratio for both the static algorithms and dynamic policies.
Extent	987277 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-03-04
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0051240
URI	http://hdl.handle.net/2429/5506
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2008-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

On managing visibility of resources in social networking sites Su, Ying

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights