UBC Theses and Dissertations


Cis-features mediating CAG/CTG repeat instability, the Satellog database, and candidate repeat prioritization in schizophrenia

Missirlis, Perseus Ioannis


Polyglutamine repeat expansions in the coding regions of unrelated genes have been implicated in the neurodegenerative phenotype of nine separate diseases. However, little is known about the role of flanking cis-sequences in mediating this repeat instability. Brock et al. identified an association between flanking GC content and CAG/CTG repeat instability at many of these disease loci by using a relative measure of repeat instability called 'expandability'. We extended the analysis of Brock and colleagues, using the expandability metric to test for associations with other features theorized to contribute to CAG/CTG repeat instability: repeat length and purity, proximity to CCCTC-binding factor (CTCF) binding sites, and the nucleosome formation potential of the surrounding DNA. Our results confirmed the previously reported relationship between flanking GC content and CAG/CTG repeat instability and also suggest a novel one involving flanking CTCF binding sites. In contrast, no relationships were detected between expandability and repeat length, purity, or nucleosome formation potential.

Anticipation refers to the progressive worsening of a disease phenotype and earlier age of onset in successive generations. Anticipation has been reported in a number of diseases in which repeat expansion may have a role in etiology. We developed Satellog, a database that catalogs all pure repeats in the human genome with repeat unit lengths of 1-16 nucleotides, along with supplementary data useful for prioritizing repeats in disease association studies. For each pure repeat we calculate the percentile rank of its length relative to other repeats of the same class in the genome, its polymorphism within UniGene clusters, its location either within or adjacent to EnsEMBL-defined genes, and its expression profile in normal tissues according to the GeneNote database.

By examining the global repeat polymorphism profile, we found that highly polymorphic coding repeats were mostly restricted to trinucleotide repeats, whereas a wider range of repeat unit lengths was tolerated in untranslated sequence. We also found that 3'-UTR sequence tolerates more repeat polymorphisms than 5'-UTR or exonic sequence. Lastly, we used Satellog to prioritize repeats for disease-association studies in schizophrenia. Satellog is available as a freely downloadable MySQL database and as a web-based database.



For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.