Extractive summarization of long documents by combining global and local context

UBC Theses and Dissertations

Extractive summarization of long documents by combining global and local context Xiao, Wen

Abstract

In this thesis, we propose a novel neural single-document extractive summarization model for long documents that incorporates both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, PubMed and arXiv, where it outperforms previous work, both extractive and abstractive, on ROUGE-1 and ROUGE-2 scores. Consistent with our goal, the benefits of our method become stronger as it is applied to longer documents. We also show that when topic segment information is not explicitly provided, applying a pretrained topic segmentation model that splits documents into sections keeps our model competitive with state-of-the-art models.

Item Metadata

Title	Extractive summarization of long documents by combining global and local context
Creator	Xiao, Wen
Publisher	University of British Columbia
Date Issued	2019
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2019-08-20
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0380504
URI	http://hdl.handle.net/2429/71354
Degree	Master of Science - MSc
Program	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2019-09
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
