Database-driven whole genome profiling for stratifying Triple Negative Breast Cancers (TNBC)

UBC Theses and Dissertations

Database-driven whole genome profiling for stratifying Triple Negative Breast Cancers (TNBC) Asiimwe, Rebecca

Abstract

Whole genome sequencing of cancers for variant discovery and patient stratification generates vast amounts of data, on the order of 10^6 relevant features per sample. Current practice is to store these data in flat files, whose structure complicates storage, querying, and integrative mining and analysis alongside orthogonally collected data such as phenotype and clinical outcomes. In this study we designed, developed and optimized an object-relational database to support storage, integration, querying, analysis and visualization of large-scale whole genome profiling data at the level of individual genome-wide somatic variants (CNAs, SNVs, SVs and indels). We structured variant data from analytics pipelines and implemented a PostgreSQL database into which we bulk-loaded clinical outcomes and somatic variants from 88 Triple Negative Breast Cancers (TNBCs). Our focus on TNBC was driven by the urgent need for better characterization of the genetic, molecular and clinical biomarkers of this heterogeneous, more aggressive and difficult-to-treat breast cancer subtype, for which treatment options are limited. We chose whole genome sequencing (WGS) because it provides an in-depth view of the landscape of mutations across the genome, which may reflect specific mutational processes that are targetable vulnerabilities in human cancers. However, no WGS study of TNBC at scale has yet investigated genomic properties as a stratification tool.
On this basis, we applied the database and demonstrate its utility in supporting access to, and exploration, analysis and visualization of, the genomic contents of patient tumours. These capabilities support quality control, inference of the patterns of mutations and genomic events underpinning a patient’s disease, population-level aggregation analysis, gene mutation visualization and patient stratification. Furthermore, we developed Genome-Miner, a web-based database user interface that supports interactive and convenient access, sharing, interrogation and visualization of the collected data across research groups. We anticipate that the database infrastructure we present will have utility in other whole genome studies and will push the field beyond the use of flat files for managing whole genome datasets in cancer.
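
The abstract does not give the schema, so the following is only a minimal sketch of the kind of population-level aggregation a relational store makes straightforward. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The table names, column names, gene labels and rows are all hypothetical and are not taken from the thesis.

```python
import sqlite3

# Assumption: SQLite in memory stands in for the PostgreSQL database
# described in the abstract. All names and rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id TEXT PRIMARY KEY,
    outcome    TEXT NOT NULL
);
CREATE TABLE snv (
    patient_id TEXT NOT NULL REFERENCES patient(patient_id),
    chrom      TEXT NOT NULL,
    pos        INTEGER NOT NULL,
    ref        TEXT NOT NULL,
    alt        TEXT NOT NULL,
    gene       TEXT
);
CREATE INDEX snv_gene_idx ON snv (gene);
""")

# Toy clinical outcomes and somatic SNV calls (hypothetical).
patients = [("P1", "relapse"), ("P2", "no_relapse"), ("P3", "relapse")]
snvs = [
    ("P1", "17", 1000, "C", "T", "GENE_A"),
    ("P1", "17", 1050, "G", "A", "GENE_A"),  # second hit, same gene and patient
    ("P2", "17", 1000, "C", "T", "GENE_A"),
    ("P3", "13", 2000, "A", "G", "GENE_B"),
    ("P1", "13", 2100, "T", "C", "GENE_B"),
]
conn.executemany("INSERT INTO patient VALUES (?, ?)", patients)
conn.executemany("INSERT INTO snv VALUES (?, ?, ?, ?, ?, ?)", snvs)


def mutated_patients_per_gene(connection):
    """Count distinct mutated patients per gene, split by clinical outcome."""
    query = """
        SELECT s.gene, p.outcome, COUNT(DISTINCT s.patient_id) AS n_patients
        FROM snv AS s
        JOIN patient AS p ON s.patient_id = p.patient_id
        GROUP BY s.gene, p.outcome
        ORDER BY s.gene, p.outcome
    """
    return connection.execute(query).fetchall()


for gene, outcome, n_patients in mutated_patients_per_gene(conn):
    print(gene, outcome, n_patients)
```

With flat files, the same question would typically mean parsing one variant file per sample and joining the result to a separate clinical table by hand. Here the join between variants and outcomes is a single declarative query.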

Item Metadata

Title	Database-driven whole genome profiling for stratifying Triple Negative Breast Cancers (TNBC)
Creator	Asiimwe, Rebecca
Publisher	University of British Columbia
Date Issued	2019
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2021-03-31
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0377717
URI	http://hdl.handle.net/2429/69381
Degree (Theses)	Master of Science - MSc
Program (Theses)	Bioinformatics
Affiliation	Science, Faculty of
Degree Grantor	University of British Columbia
Graduation Date	2019-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
