UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Combating the cold-start user problem in collaborative filtering recommender systems Biswas, Sampoorna


For tackling the well known cold-start user problem in collaborative filtering recommender systems, one approach is to recommend a few items to a cold-start user and use the feedback to learn her preferences. This can then be used to make good recommendations to the cold user. In the absence of a good initial estimate of the preferences, the recommendations are like random probes. If the items are not chosen judiciously, both bad recommendations and too many recommendations may turn off a user. We study the cold-start user problem for the two main types of collaborative filtering methods -- neighbourhood-based and matrix factorization. We formalize the problem by asking what are the b (typically a small number) items we should recommend to a cold-start user, in order to learn her preferences best, and define what it means to do that under each framework. We cast the problem as a discrete optimization problem, called the optimal interview design (OID) problem, and study two variants -- OID-NB and OID-MF -- for the two frameworks. We present multiple non-trivial results, including NP-hardness as well as hardness of approximation for both. We further study supermodularity/submodularity and monotonicity properties for the objective functions of the two variants. Finally, we propose efficient algorithms and comprehensively evaluate them. For OID-NB, we propose a greedy algorithm, and experimentally evaluate it on 2 real datasets, where it outperforms all the baselines. For OID-MF, we discuss several scalable heuristic approaches for identifying the b best items to recommend to the user and experimentally evaluate their performance on 4 real datasets. Our experiments show that our proposed accelerated algorithms significantly outperform the prior art, while obtaining similar or lower error in the learned user profile as well as in the rating predictions made, on all the datasets tested.

Item Citations and Data


Attribution-NonCommercial-NoDerivatives 4.0 International

Usage Statistics