UBC Theses and Dissertations

Cluster-based information retrieval modeling

Sze, Richard


Cluster-based information retrieval (IR), an extension of information retrieval strategy, is based on the assumption that a document collection can be organized into a set of topics in a way that lets a user enhance retrieval effectiveness. The cluster-based IR model assumes that queries can be associated with clusters that contain high concentrations of relevant documents, and that this association can lead to gains in retrieval effectiveness. Earlier studies, however, have reported negative to mixed results for the performance of the model. Moreover, studies that investigate the potential of the model when queries are manually associated with the appropriate clusters are lacking. The goal of this thesis is to provide evidence for the effectiveness of the cluster-based IR model through extensive empirical studies that explore alternative schemes of the model on a large scale and against a well-accepted benchmark. The investigation shows that the cluster-based IR model has the potential to enhance retrieval effectiveness, yet the alternative techniques fail to actually achieve that enhancement.
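
The core assumption of the model, that a query can be associated with a cluster and that retrieval is then restricted to that cluster, can be illustrated with a minimal sketch. This is not the thesis's method: the toy documents, the hand-assigned clusters, the bag-of-words vectors, and the cosine-similarity scoring are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for lowercase, whitespace-tokenized text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_retrieve(query, docs, clusters):
    """Pick the cluster whose centroid best matches the query,
    then rank only the documents inside that cluster."""
    q = vectorize(query)
    vecs = {doc_id: vectorize(text) for doc_id, text in docs.items()}
    centroids = {}
    for name, members in clusters.items():
        centroid = Counter()
        for doc_id in members:
            centroid.update(vecs[doc_id])  # sum of member term counts
        centroids[name] = centroid
    best = max(centroids, key=lambda name: cosine(q, centroids[name]))
    ranked = sorted(clusters[best], key=lambda d: cosine(q, vecs[d]), reverse=True)
    return best, ranked

# Hypothetical toy collection with clusters assigned by hand (an assumption).
docs = {
    "d1": "query expansion improves retrieval recall",
    "d2": "document clustering groups similar documents",
    "d3": "soil erosion reduces crop yield",
    "d4": "irrigation improves crop yield",
}
clusters = {"ir": ["d1", "d2"], "agriculture": ["d3", "d4"]}

best, ranked = cluster_retrieve("crop irrigation", docs, clusters)
print(best, ranked)
```

In this sketch the query-to-cluster association is automatic (best-matching centroid). The manual association mentioned in the abstract would correspond to passing the chosen cluster name directly instead of computing `best`.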


For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.