Using semantic document representation to increase performance in information retrieval

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Using semantic document representation to increase performance in information retrieval Rongione, Nicholas

Abstract

The first computational Information Retrieval projects were straightforward encodings of card catalogues and look-up tables that had existed before. Soon after, these electronic indices evolved into more advanced structures. Primary among those are the Inverted Index (II) and the Vector Model (VM). These new structures expanded the domain of indexing science to large sets of texts heterogeneous with respect to content and length. This thesis presents an overview and critique of these techniques. It is found that these word driven methods are limited because they deal statistically with linguistic phenomena that resist that type of analysis. It goes on to suggest that semantic analysis of the target documents is needed to go beyond the keyword barrier' of traditional methods. After a discussion of three ways to encode the semantic content of sentences, JackendofPs Lexical Conceptual Structure (LCS) is selected as most appropriate for this domain. The FINDER semantic information retrieval system, which uses the LCS representation for target documents and user queries, is described. This description, along with example searches, support the claim that a semantic information retrieval system has practical advantages over one that uses traditional methods.

Item Metadata

Title	Using semantic document representation to increase performance in information retrieval
Creator	Rongione, Nicholas
Publisher	University of British Columbia
Date Issued	1999
Description	The first computational Information Retrieval projects were straightforward encodings of card catalogues and look-up tables that had existed before. Soon after, these electronic indices evolved into more advanced structures. Primary among those are the Inverted Index (II) and the Vector Model (VM). These new structures expanded the domain of indexing science to large sets of texts heterogeneous with respect to content and length. This thesis presents an overview and critique of these techniques. It is found that these word driven methods are limited because they deal statistically with linguistic phenomena that resist that type of analysis. It goes on to suggest that semantic analysis of the target documents is needed to go beyond the keyword barrier' of traditional methods. After a discussion of three ways to encode the semantic content of sentences, JackendofPs Lexical Conceptual Structure (LCS) is selected as most appropriate for this domain. The FINDER semantic information retrieval system, which uses the LCS representation for target documents and user queries, is described. This description, along with example searches, support the claim that a semantic information retrieval system has practical advantages over one that uses traditional methods.
Extent	3365297 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2009-06-26
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051616
URI	http://hdl.handle.net/2429/9740
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	1999-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1999-0593.pdf -- 3.21MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

Using semantic document representation to increase performance in information retrieval Rongione, Nicholas

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights