UBC Theses and Dissertations

Extractive summarization of long documents by combining global and local context

Xiao, Wen

Abstract

In this thesis, we propose a novel neural single-document extractive summarization model for long documents that incorporates both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, PubMed and arXiv, where it outperforms previous work, including both extractive and abstractive models, on ROUGE-1 and ROUGE-2 scores. We also show that, consistent with our goal, the benefits of our method grow as it is applied to longer documents. Finally, we show that when topic segment information is not explicitly provided, applying a pretrained topic segmentation model to split documents into sections still leaves our model competitive with state-of-the-art models.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International