UBC Theses and Dissertations
Efficient approximation of social relatedness over large social networks and application to query enabled recommender systems Esfandiar, Pooya
Social relatedness measures such as the Katz score and the commute time between pairs of nodes have been subject of significant research effort motivated by social network problems including link prediction, anomalous link detection, and collaborative filtering. In this thesis, we are interested in computing: (1) the score for a given pair of nodes, and (2) the top-k nodes with the highest scores from a specific source node. Unlike most traditional approaches, ours scale to large networks with hundreds of thousands of nodes. We introduce an efficient iterative algorithm which calculates upper and lower bounds for the pairwise measures. For the top-k problem, we propose an algorithm that only has access to a small subset of nodes. Our approaches rely on techniques developed in numerical linear algebra and personalized PageRank computing. Using three real-world networks, we examine scalability and accuracy of our algorithms as in a short time as milliseconds to seconds. We also hypothesize that incorporating item based tags into a recommender system will improve its performance. We model such a system as a tri-partite graph of users, items and tags and use this graph to define a scoring function making use of graph-based proximity measures. Exactly calculating the item scores is computationally expensive, so we use the proposed top-k algorithm to calculate the scores. The usefulness and efficiency of the approaches are compared to a simple, non-graph based, approach. We evaluate these approaches on a combination of the Netflix ratings data and the IMDb tag data.
Item Citations and Data
Attribution-NonCommercial-NoDerivatives 4.0 International