Open Collections
UBC Theses and Dissertations
Flexible and efficient exploration of rated datasets
Kolloju, Naresh Kumar
Abstract
As users increasingly rely on collaborative rating sites to achieve mundane tasks such as purchasing a product or renting a movie, they are facing the data deluge of ratings and reviews. Traditionally, the exploration of rated data sets has been enabled by rating averages that allow user-centric, item-centric and top-k exploration. More specifically, canned queries on user demographics aggregate opinion for an item or a collection of items, such as "18-29 year old males in CA rated the movie The Social Network at 8.2 on average." Combining ratings, demographics, and item attributes is a powerful exploration mechanism that allows operations such as comparing the opinion of the same users for two items, comparing two groups of users on their opinion for a given class of items, and finding a group whose rating distribution is nearly unanimous for an item. To enable those operations, it is necessary to (i) adopt the right measure to compare ratings, and to (ii) develop efficient algorithms to find relevant <user,item,rating> groups. We argue that rating average is a weak measure for capturing such comparisons. We propose contrasting and querying rating distributions, instead, using the Earth Mover's Distance (EMD), a measure that intuitively reflects the amount of work needed to transform one distribution into another. We show that the problem of finding groups matching given rating distributions is NP-hard under different settings and develop appropriate heuristics. Our experiments on real and synthetic datasets validate the utility of our approach and demonstrate the scalability of our algorithms.
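The abstract's claim that a rating average is a weak comparison measure can be illustrated with a small sketch. It is not taken from the thesis: it assumes a 1-to-5 rating scale, a ground distance of one unit between adjacent rating values, and hypothetical helper names (`normalize`, `mean_rating`, `emd_1d`). Under those assumptions, the EMD between two normalized one-dimensional histograms equals the sum of absolute differences of their cumulative distributions.

```python
from itertools import accumulate


def normalize(counts):
    """Turn raw rating counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty rating histogram")
    return [c / total for c in counts]


def mean_rating(counts, lowest=1):
    """Average rating, assuming bins are consecutive values starting at `lowest`."""
    probs = normalize(counts)
    return sum((lowest + i) * p for i, p in enumerate(probs))


def emd_1d(counts_a, counts_b):
    """EMD between two histograms over the same ordered rating scale.

    Assumes unit distance between adjacent ratings; in one dimension the
    EMD then reduces to the L1 distance between the two CDFs.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must cover the same rating scale")
    cdf_a = accumulate(normalize(counts_a))
    cdf_b = accumulate(normalize(counts_b))
    return sum(abs(x - y) for x, y in zip(cdf_a, cdf_b))


# Hypothetical groups: one polarized (half 1s, half 5s), one unanimous at 3.
polarized = [50, 0, 0, 0, 50]
lukewarm = [0, 0, 100, 0, 0]

print(mean_rating(polarized), mean_rating(lukewarm))  # identical averages
print(emd_1d(polarized, lukewarm))                    # clearly nonzero
```

Both hypothetical groups have the same average, yet the EMD between them is large, because half of the rating mass must move two steps up and the other half two steps down to turn one distribution into the other.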
Item Metadata
Title | Flexible and efficient exploration of rated datasets |
Creator | |
Publisher | University of British Columbia |
Date Issued | 2013 |
Genre | |
Type | |
Language | eng |
Date Available | 2013-08-31 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0052206 |
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia |
Graduation Date | 2013-05 |
Campus | |
Scholarly Level | Graduate |
Rights URI | |
Aggregated Source Repository | DSpace |