UBC Theses and Dissertations
Mapping complex genomic translocations using Strand-seq
Yuen, Michael Wai-Keong
Template strand sequencing (Strand-seq) is a single-cell sequencing approach that maintains the 5’ to 3’ directionality of sequence reads. I hypothesized that this preserved directional information can be used to map complex translocation events. Translocations often disrupt gene expression by reshuffling regulatory elements or by forming novel fusion transcripts, yet their detection is often difficult and is confounded by the complexity of the structural variations (SVs) involved. I chose a cell line derived from a patient with pediatric acute lymphoblastic leukemia (iALL) with a known complex karyotype. My aim was to explore the ability of Strand-seq to identify breakpoints, link translocation partners and resolve the configuration of SVs, comparing low-coverage Strand-seq data against high-coverage whole genome sequencing (WGS) data as a reference. The iALL cells selected for my study harbor complex translocations involving 4 chromosomes with 5 breakpoint positions that were previously validated by fluorescence in situ hybridization (FISH). BreakpointR, a novel pipeline for Strand-seq analysis, identified 18 breakpoints, 5 of which were selected for further analysis. These 5 breakpoints were identified at a resolution of 5-60 kb and overlapped with the genomic positions of breakpoints identified by WGS analysis, validating the accuracy of BreakpointR. Overall, the lower-coverage Strand-seq data yielded 18 breakpoints, compared with 119 from WGS. Next, I developed a workflow to link the translocation partners involved in the breakpoints, successfully linking 4 of 5 translocation partners; the final fragment remained unresolved due to a lack of reads within the genomic interval, a limitation of low sequence coverage. By comparison, WGS also linked 4 of 5 translocation partners, but its difficulty in mapping across repetitive regions left a different fragment unresolved.
Once translocation partners had been matched, the single-cell resolution of Strand-seq allowed me to further interrogate the expected breakpoints in each individual cell. Strand-seq analysis identified an inversion of a 100 kb fragment on chromosome 11, validated with Sanger sequencing, representing an additional layer of complexity not identified by the other approaches. I conclude that the application of Strand-seq to SV mapping should be explored further, as it proved useful for complementing existing approaches where mapping complex SVs across repetitive regions is inherently difficult.
Attribution-NonCommercial-NoDerivatives 4.0 International