The development and application of new computational tools for working with viral metagenomic data

UBC Theses and Dissertations

Kitson, Ezra

Abstract

Next-generation, high-throughput sequencing has revolutionised the way in which we are able to view the microbial world. We have now generated a large volume of metagenomic sequence data describing viruses and bacteria in diverse environments across the planet. These data require computational processing before they can be used in further analysis. Manipulating the data in the way a bioinformatician wants is often a major difficulty in metagenomic research. There are two reasons for this. First, as the field is nascent, there are many useful data processing tasks that do not yet have published computational tools. Second, the computational tools that have been published to date are often poorly documented and complicated, making them difficult to use in routine applications. Research in this thesis focusses on developing simple computational tools for managing viral metagenomic data. Viral metagenomic data present the bioinformatician with specific difficulties owing to their size, poor quality and largely novel sequence content. Four new computational tools for managing viral metagenomic data are presented and benchmarked here. Three of these tools expedite everyday research tasks, automating processes that would otherwise be done manually. The fourth, VHost-Classifier, allows a new scientific question to be asked using viral metagenomic data. In the final chapter, VHost-Classifier is applied to analyse viruses published in the NCBI taxonomy database by host organism. The results reveal a large anthropocentric bias in viral sequencing.

Item Metadata

Title	The development and application of new computational tools for working with viral metagenomic data
Creator	Kitson, Ezra
Publisher	University of British Columbia
Date Issued	2019
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2019-04-23
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0378369
URI	http://hdl.handle.net/2429/69887
Degree (Theses)	Master of Science - MSc
Program (Theses)	Microbiology and Immunology
Affiliation	Science, Faculty of; Microbiology and Immunology, Department of
Degree Grantor	University of British Columbia
Graduation Date	2019-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
