UBC Theses and Dissertations
Evolutionary conservation of long intergenic non-coding RNA genes in Arabidopsis
Hammel, Alexander John
Long intergenic non-coding RNA (lincRNA) genes are a poorly studied class of genes, particularly in plants. Because of the low expression levels, high tissue specificity, and rapid evolution of lincRNA transcripts, the discovery and functional annotation of these molecules is a significant challenge. Here, I report the annotation of 201 new lincRNA transcripts in Arabidopsis thaliana, discovered using the results of a single RNA-seq experiment on a normalized library. Using these sequences, along with the 6 480 lincRNA genes annotated by Liu et al. (2012), I performed a pairwise sequence alignment experiment against the genomes of 22 plant species in order to discover highly conserved sequences within lincRNA loci. Of the 6 681 lincRNA sequences examined, 3 374 have highly conserved sequences supported by multiple genomic alignments to other species. Six of these show evidence of an ongoing reduced rate of sequence evolution when single-nucleotide variant data from the recent evolutionary history of Arabidopsis thaliana are considered. The rate of retention of these conserved regions within the Brassicaceae suggests a much higher rate of sequence turnover in lincRNA genes than in protein-coding genes. Structural variant data from 80 different A. thaliana ecotypes suggest that lincRNA genes suffer deletions of the entire locus from the genome with appreciable frequency: 570 of the lincRNA loci examined are entirely missing from at least one A. thaliana strain. These results suggest an intriguing mixture of rapid sequence evolution with short, highly conserved islands in lincRNA genes.
Attribution-NonCommercial-NoDerivs 2.5 Canada