Information flow identification in large email datasets

UBC Theses and Dissertations

Akuney, Arseniy

Abstract

Identifying information flow in emails is an important yet challenging task. In this work, we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and running time. We also investigate the use of cloud computing to increase computational efficiency and make information discovery usable in real time.

Item Metadata

Title	Information flow identification in large email datasets
Creator	Akuney, Arseniy
Publisher	University of British Columbia
Date Issued	2011
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2011-12-23
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0052116
URI	http://hdl.handle.net/2429/39847
Degree (Theses)	Master of Science - MSc
Program (Theses)	Computer Science
Affiliation	Science, Faculty of; Computer Science, Department of
Degree Grantor	University of British Columbia
Graduation Date	2012-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
