- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Summarization of partial email threads : silver standards...
Open Collections
UBC Theses and Dissertations
UBC Theses and Dissertations
Summarization of partial email threads : silver standards and bayesian surprise Johnson, Jordon Kent
Abstract
We define and motivate the problem of summarizing partial email threads. This problem introduces the challenge of generating reference summaries for these partial threads when extractive human annotation is only available for the threads as a whole, since gold standard annotation intended to summarize a completed email thread may not always be equally applicable to each of its partial threads, particularly when the human-selected sentences are not uniformly distributed within the threads. We propose a framework for generating these reference summaries with arbitrary length in an oracular manner by exploiting existing gold standard summaries for completed email threads. We also propose and evaluate two sentence scoring functions that can be used in this "silver standard" framework, and we are making the resulting datasets publicly available. In addition, we apply a recent unsupervised method based on Bayesian Surprise that incorporates background knowledge to partial thread summarization, extend that method with conversational features, and modify the mechanism by which it handles information redundancy. Experiments with our partial thread summarizers indicate comparable or improved performance relative to a state-of-the-art unsupervised full thread summarizer baseline in most cases; and we have identified areas in which potential vulnerabilities in our methods can be avoided or accounted for. Furthermore, our results suggest that the potential benefits of background knowledge to partial thread summarization should be further investigated with larger datasets.
Item Metadata
Title |
Summarization of partial email threads : silver standards and bayesian surprise
|
Creator | |
Publisher |
University of British Columbia
|
Date Issued |
2018
|
Description |
We define and motivate the problem of summarizing partial email threads. This problem introduces the challenge of generating reference summaries for these partial threads when extractive human annotation is only available for the threads as a whole, since gold standard annotation intended to summarize a completed email thread may not always be equally applicable to each of its partial threads, particularly when the human-selected sentences are not uniformly distributed within the threads. We propose a framework for generating these reference summaries with arbitrary length in an oracular manner by exploiting existing gold standard summaries for completed email threads. We also propose and evaluate two sentence scoring functions that can be used in this "silver standard" framework, and we are making the resulting datasets publicly available. In addition, we apply a recent unsupervised method based on Bayesian Surprise that incorporates background knowledge to partial thread summarization, extend that method with conversational features, and modify the mechanism by which it handles information redundancy. Experiments with our partial thread summarizers indicate comparable or improved performance relative to a state-of-the-art unsupervised full thread summarizer baseline in most cases; and we have identified areas in which potential vulnerabilities in our methods can be avoided or accounted for. Furthermore, our results suggest that the potential benefits of background knowledge to partial thread summarization should be further investigated with larger datasets.
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2018-04-18
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0365780
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2018-05
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International