UBC Theses and Dissertations

Using semantic document representation to increase performance in information retrieval

Rongione, Nicholas

Abstract

The first computational Information Retrieval projects were straightforward encodings of the card catalogues and look-up tables that already existed. Soon after, these electronic indices evolved into more advanced structures, chief among them the Inverted Index (II) and the Vector Model (VM). These new structures expanded the domain of indexing science to large sets of texts that are heterogeneous with respect to content and length. This thesis presents an overview and critique of these techniques. It finds that these word-driven methods are limited because they deal statistically with linguistic phenomena that resist that type of analysis. The thesis goes on to suggest that semantic analysis of the target documents is needed to go beyond the 'keyword barrier' of traditional methods. After a discussion of three ways to encode the semantic content of sentences, Jackendoff's Lexical Conceptual Structure (LCS) is selected as the most appropriate for this domain. The FINDER semantic information retrieval system, which uses the LCS representation for target documents and user queries, is described. This description, along with example searches, supports the claim that a semantic information retrieval system has practical advantages over one that uses traditional methods.

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.