UBC Theses and Dissertations

Approaches to genome analysis through the application of graph theory

Kaye, Alice M.

Abstract

The human reference genome provides a framework against which the analysis and interpretation of an individual’s genome can be performed. Over the past twenty years the cost of genome sequencing has dropped from a prohibitive hundreds of millions of dollars to just a few thousand dollars. This has brought genome sequencing in line with the cost of other diagnostic medical tests, leading to a rapid uptake in both clinical and research settings. As a consequence of this global spread, deficiencies and population-specific inequities have emerged from the use of a framework that relies upon a single linear reference sequence. Partial, ad hoc solutions, such as the introduction of alternative sequences for sections of the genome, have provided a stopgap but fail to fully represent the wealth of information now known about the level of variation that exists within and between populations. This thesis presents an alternative perspective on how we can take advantage of new computational methods to enhance the reference genome in the era of widespread sequencing and big data. An argument is given to motivate a re-evaluation of the role of the reference genome, calling for a non-indexed, mutable reference framework in which the crucial indexing methods are shifted from the linear reference to the raw read set. A patented, edge-labelled, cyclic, graph-based model, the GNOmics Graph Model, is introduced as a flexible framework against which read alignment and variant calling can be performed. The value of indexing raw reads is explored through a published tool, FlexTyper, which allows a read set to be screened for informative markers. While there is still an ongoing global discussion as to how best to improve the reference genome, this thesis provides a thought-provoking reconceptualisation of applied human genome analysis.

License

Attribution-NonCommercial-NoDerivatives 4.0 International
