An epistemological approach to domain-specific multiple biographical document summarization

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

An epistemological approach to domain-specific multiple biographical document summarization Tennessy, Blair

Abstract

Automatic document summarization consists of two tasks: understanding and generation. Understanding is a technique in which relevant content is identified, processed, and annotated. Generation is the process of restating important content in a concise form. As a task for an intelligent system, summarization is a crucial operation: by what process can you succinctly restate pertinent information contained within a set of documents, citing only the essential facts relevant to the query at hand? In this thesis we demonstrate a conceptual approach to multiple biographical document summarization. Specifically, we apply domain-specific semantic and temporal document understanding methods to multi-document biographical summarization. Our purpose is to more fully address the important criteria—routinely cited yet rarely approached—of multi-document summarization. These criteria, namely the discovery and resolution of identical, complementary, or contradictory statements, have been roughly treated using general lexico-semantic methods. We maintain that the general semantically-informed methods previously devised for unrestricted text are not completely suitableto biography summarization; instead, it is our conviction that one must have at least a partial conceptual understanding of the subject's domain in order to reason about the importance and verity of document information. We hold that this is especially true for establishing temporal relationships, which is at the heart of biography understanding and production. What we demonstrate in this thesis is that an extremely course approximation to an epistemological system based on concepts is able to satisfy the criteria of a multi-document summarization system in a particular domain. Our methods, while primitive, provide a lower-bound on the performance of such a system.

Item Metadata

Title	An epistemological approach to domain-specific multiple biographical document summarization
Creator	Tennessy, Blair
Publisher	University of British Columbia
Date Issued	2006
Description	Automatic document summarization consists of two tasks: understanding and generation. Understanding is a technique in which relevant content is identified, processed, and annotated. Generation is the process of restating important content in a concise form. As a task for an intelligent system, summarization is a crucial operation: by what process can you succinctly restate pertinent information contained within a set of documents, citing only the essential facts relevant to the query at hand? In this thesis we demonstrate a conceptual approach to multiple biographical document summarization. Specifically, we apply domain-specific semantic and temporal document understanding methods to multi-document biographical summarization. Our purpose is to more fully address the important criteria—routinely cited yet rarely approached—of multi-document summarization. These criteria, namely the discovery and resolution of identical, complementary, or contradictory statements, have been roughly treated using general lexico-semantic methods. We maintain that the general semantically-informed methods previously devised for unrestricted text are not completely suitableto biography summarization; instead, it is our conviction that one must have at least a partial conceptual understanding of the subject's domain in order to reason about the importance and verity of document information. We hold that this is especially true for establishing temporal relationships, which is at the heart of biography understanding and production. What we demonstrate in this thesis is that an extremely course approximation to an epistemological system based on concepts is able to satisfy the criteria of a multi-document summarization system in a particular domain. Our methods, while primitive, provide a lower-bound on the performance of such a system.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2010-01-06
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0051642
URI	http://hdl.handle.net/2429/17619
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2006-05
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_2006-0117.pdf -- 6.44MB

Item Citations and Data

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.

Open Collections

UBC Theses and Dissertations

An epistemological approach to domain-specific multiple biographical document summarization Tennessy, Blair

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights