UBC Faculty Research and Publications

The developing xylem transcriptome and genome-wide analysis of alternative splicing in Populus trichocarpa… Bao, Hua; Li, Eryang; Mansfield, Shawn D; Cronk, Quentin C; El-Kassaby, Yousry A; Douglas, Carl J May 29, 2013

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata


52383-12864_2012_Article_5079.pdf [ 337.31kB ]
JSON: 52383-1.0221338.json
JSON-LD: 52383-1.0221338-ld.json
RDF/XML (Pretty): 52383-1.0221338-rdf.xml
RDF/JSON: 52383-1.0221338-rdf.json
Turtle: 52383-1.0221338-turtle.txt
N-Triples: 52383-1.0221338-rdf-ntriples.txt
Original Record: 52383-1.0221338-source.json
Full Text

Full Text

RESEARCH ARTICLE Open AccessThe developing xylem transcriptome andgenome-wide analysis of alternative splicing inPopulus trichocarpa (black cottonwood)populationsHua Bao1, Eryang Li2, Shawn D Mansfield3, Quentin CB Cronk2, Yousry A El-Kassaby1 and Carl J Douglas2*AbstractBackground: Alternative splicing (AS) of genes is an efficient means of generating variation in protein structureand function. AS variation has been observed between tissues, cell types, and different treatments in non-woodyplants such as Arabidopsis thaliana (Arabidopsis) and rice. However, little is known about AS patterns in wood-forming tissues and how much AS variation exists within plant populations.Results: Here we used high-throughput RNA sequencing to analyze the Populus trichocarpa (P. trichocarpa) xylemtranscriptome in 20 individuals from different populations across much of its range in western North America. Deeptranscriptome sequencing and mapping of reads to the P. trichocarpa reference genome identified a suite of xylem-expressed genes common to all accessions. Our analysis suggests that at least 36% of the xylem-expressed genes inP. trichocarpa are alternatively spliced. Extensive AS was observed in cell-wall biosynthesis related genes such asglycosyl transferases and C2H2 transcription factors. 27902 AS events were documented and most of these eventswere not conserved across individuals. Differences in isoform-specific read densities indicated that 7% and 13% ofAS events showed significant differences between individuals within geographically separated southern andnorthern populations, a level that is in general agreement with AS variation in human populations.Conclusions: This genome-wide analysis of alternative splicing reveals high levels of AS in P. trichocarpa andextensive inter-individual AS variation. We provide the most comprehensive analysis of AS in P. trichocarpa to date,which will serve as a valuable resource for the plant community to study transcriptome complexity and ASregulation during wood formation.BackgroundAlternative splicing (AS) is considered to be a key factorunderlying increased cellular and functional complexityin higher eukaryotes and has been studied extensively onthe genome-wide scale in humans, other animals, andplants [1-5]. In humans, RNA splice variants with alter-native exon configurations often accumulate differen-tially across different tissues and individuals [1,6-8] andsuch tissue-specific gene isoforms can have importantfunctions in development and in the functioning of dif-ferent cell types.Relatively few studies have investigated genome-widepatterns of AS in plant species [9], but recent resultsfrom Arabidopsis [10] and rice [11] have revealed highlevels of AS that can vary in different organs and underdifferent stress conditions. Next generation highthroughput transcriptome sequencing (RNA-Seq) ana-lysis suggests that 42% of the intron-containing genes inArabidopsis undergo AS [10]. Using a normalized cDNAlibrary derived from flower and seedling tissue, Marquezet al. [5] used deep RNA-Seq transcriptome analysis toshow that over 60% of Arabidopsis intron-containinggenes are alternatively spliced, with intron retention (IR)being the most common form of AS. The analysis re-vealed, however, that over 50% of genes surveyed displaynon-IR AS, and within IR variants, a large number of* Correspondence: carl.douglas@botany.ubc.ca2Department of Botany, University of British Columbia, Vancouver, BCV6T1Z4, CanadaFull list of author information is available at the end of the article© 2013 Bao et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the CreativeCommons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, andreproduction in any medium, provided the original work is properly cited.Bao et al. BMC Genomics 2013, 14:359http://www.biomedcentral.com/1471-2164/14/359cryptic introns were spliced out in-frame. Thus, as inhumans, large scale AS in plants is likely to contributeto proteome and phenotypic diversity.Few studies have addressed AS variation among indi-viduals within a species, although it has been noted thatextensive AS variation in humans may underlie pheno-typic diversity and disease susceptibility [7], and muta-tions affecting alternative splicing can play a role indisease [12]. Recently, RNA-Seq was used for genome-wide analysis of variation in AS splicing and expressionlevels across human individuals [13,14]. High depthtranscriptome sequencing of cell lines derived from twounrelated individuals revealed a large number of genesdifferentially spliced between the two cell lines, andmany of these differences were associated with singlenucleotide polymorphisms (SNPs) found in close prox-imity [13]. This and other studies [14] analyzing expres-sion level variation and AS variation in individuals fromthe human HapMap project suggested that genetic vari-ation (e.g., SNPs) underlies extensive AS and expressionlevel variation among unrelated individuals, which ap-pears to contribute to phenotypic diversity. However,variation of alternative splicing within plant populationhas not yet been subject to genome-wide analysis.The sequencing of the first tree, Populus trichocarpa(black cottonwood; referred to as “poplar” throughout)[15], creates opportunities for genomic studies in this spe-cies, and in particular for investigation of biological pro-cesses important in woody plants, such as perenniality,secondary growth, and secondary xylem (wood) develop-ment [16–18]. P. trichocarpa is the largest deciduous treenative to western North America with a range that extendsfrom California to Alaska, USA, a latitudinal range of over30°. One advantage of woody plants such as P. trichocarpaand other trees is the ability to isolate pure tissue types, i.e.developing xylem and phloem, from woody stems duringthe active growth period, which can be used for studies oftissue-specific transcriptomes and proteomes. Publishedstudies on the poplar xylem transcriptome, and the xylemtranscriptomes of other woody plants, have been largelylimited to use of EST and full-length cDNA resourcesfrom traditional sequencing platforms [19-23]. Further-more, the annotation of AS forms in the 40,668 locicontaining protein-coding transcripts a recent annotationof the P. trichocarpa genome (version 2.2; http://www.phytozome.net/poplar) is limited. Given its large popula-tion size across a large geographical range of varied envi-ronments, and extensive genetic polymorphism [24,25],study of P. trichocarpa transcriptomes offers the oppor-tunity for the systematic identification and characte-rization of splice variants at the population level, whichwill be of critical importance for understanding pheno-typic variation in wood and other traits in the thesepopulations.The advent of next-generation sequencing technolo-gies has now enabled AS analyses at unprecedentedlevels of sensitivity and precision. Analysis of Illumina-based RNA sequencing (RNA-Seq) data has been par-ticularly useful in the study of AS and AS variation inmodel species [1,3,5,10]. We recently used Illuminapaired-end sequencing of xylem expressed transcripts in20 individuals from populations (12 from southernand 8 from northern populations) across much of theP. trichocarpa species range to gain initial insights intothe poplar xylem transcriptome [24]. This analysis gen-erated an average depth of sequence coverage of >48 Xacross over 18,000 xylem-expressed genes, and revealedextensive single nucleotide polymorphism (SNP) vari-ation, with over 500,000 putative SNPs identified [24].This work provided a rich data set for further investiga-tion of the poplar xylem transcriptome, including ex-pression levels and AS.Here, we report the use of the P. trichocarpa develop-ing xylem RNA-Seq data set [24] to investigate xylemgene expression, to query the extent and complexity ofAS in P. trichocarpa, and to investigate potential ASvariation among different individuals from different pop-ulations. Our data provide an unprecedented and un-biased evaluation of alternative splicing in this species.The RNA-Seq data confirmed most annotated splicejunctions and identified tens of thousands of novel ASevents. We found that 36% of the P. trichocarpa xylem-expressed genes are alternatively spliced. This estimate issignificantly higher than previous estimates based oncDNA/EST sequencing [26]. Additionally, we quantifiedsplice variant expression levels (isoform frequency) acrossall 20 individuals. A large number of AS events were iden-tified that had markedly different isoform ratio in individ-uals within the studied two geographically separated setsof populations.ResultsMapping of the P. trichocarpa developing xylemtranscriptomeTwenty P. trichocarpa individuals from a population ofapproximately 450 individuals from the BC Ministry ofForests collection [27,28] maintained in a common gar-den at the University of British Columbia were selectedfor xylem transcriptome analysis using the Illuminaplatform for ultrahigh-throughput RNA sequencing(RNA-Seq). The origins of the selected individuals rangefrom 59.6°N to 44.0°N (Additional file 1), including 12individuals (PT02-PT13) from southern populations(~44°- 54°N) and 8 (PT14-PT21) from northern popula-tions (~58°- 60°N). We acquired a total of more than1,660 million paired-end reads of 36-50 bp in length forxylem of the studied 20 individuals (Additional file 2). Allshort reads were aligned onto the reference P. trichocarpaBao et al. BMC Genomics 2013, 14:359 Page 2 of 13http://www.biomedcentral.com/1471-2164/14/359genome v. 2.2 at single bp resolution using SOAP software[29]. In total, 63.6% of the RNA-Seq reads were uniquelymapped onto the reference genome (Additional file 2)and, as expected, most of these (86%) mapped to annotatedexons (Figure 1A). However, a significant fraction mappedto sequences annotated only as introns (8%) and to regionswithout annotated genes (intergenic; 6%) (Figure 1A). Wedid not investigate in detail the nature of the reads thatmapped uniquely to intergenic regions, but preliminaryanalyses suggest that many are small, highly expressedtranscriptional units, some with open reading frames, thatare not obviously associated with annotated genes (i.e.,non-annotated exons of known genes) (N. Farzaneh, A.Geraldes, and C. Douglas, unpublished).To explore potential AS events, we used SOAPsplice[30] to discover splice junctions (see Methods). 8.7% of allreads were aligned on splice junctions (Additional file 2).Overall, using the total xylem transcriptome data from 20individuals, we detected 62% (118,751) of known junctionsin the P. trichocarpa genome. In addition, we identified80,345 new splice junctions not in the P. trichocarpa gen-ome annotation (see Additional file 3 for the number ofknown and new splice junctions detected in each indivi-dual). This indicates a relatively high number of AS va-riants for which no previous experimental data exists.It is now clear that measurements of gene expressionlevels from RNA-Seq are correlated with measures of ab-solute expression level (as assayed by microarray) acrossa wide dynamic range [31]. We quantified the expressionlevels of all genes in our RNA-Seq data set by determin-ing the number of fragments per kilobase of exon in agene per million fragments mapped (FPKM; [32]). Weinvestigated the expression levels (FPKM value) of all40,668 annotated genes (P. trichocarpa v. 2.2) across 20 in-dividuals and found that 24% of genes per individual hadno coverage (FPKM=0, or no reads mapped) (Figure 1B).We observed high (FPKM >40) expression for an average8% of the genes per individual. Medium (FPKM ≥5 to ≤40) and low (FPKM >0 to ≤ 5) expression levels were ob-served for 29 and 39% of the genes, respectively. Thedepth of read matches to intergenic regions and annotatedgene features is illustrated in Figure 1C. As expected, thedepth of coverage of annotated intergenic regions and in-trons was much lower than that for exonic features.Several highly expressed genes (see Additional file 4for the the complete data set of FPKM values for 41,377gene models in all 20 individuals, and for the 100 mosthighly expressed genes in 20 individuals) encode proteinsinvolved in secondary cell wall biosynthesis, including lig-nin biosynthesis (caffeic acid O-methyltransferase, COMT;caffeoyl CoA 3-O-methyltransferase, CCoAOMT; cinnamylalcohol dehydrogenase, CAD; cinnamate 4-hydroxylaseC4H; ferulate, F5H; coumarate 3-hydroxylase C3′H; hyd-roxycinnamoyl CoA:shikimate/quinate hydroxycinnamoyl-transferase; HCT; Hamberger et al., 2007), cellulosebiosynthesis (Chitinase-like 2, CTL2; KORRIGAN), andglucuronoxylan biosynthesis (glucuronosyltransferase,IRX10). Additionally, genes encoding enzymes in methionineACBFigure 1 Mapping of Illumina reads to the P. trichocarpa genome. (A). Distribution of Illumina GAII RNA-Seq reads along annotated poplarannotated genomic features (P. trichocarpa v. 2.2). (B). Summary of the average expression level across 20 individuals. (High: FPKM>40, Medium:FPKM >5-40, Low: FPKM 0–5) (C). Expression level of exon, intergenic, intron. Mean read density values for exons, introns, and intergenic regionswere computed in units of fragments per kilobase of exon/intron/intergenic (FPKM) model.Bao et al. BMC Genomics 2013, 14:359 Page 3 of 13http://www.biomedcentral.com/1471-2164/14/359metabolism (homocysteine S-methyltransferases and me-thionine adenosyltransferases), cytoskeleton components(TUB, TUA, ACT), sucrose synthase (PtrSuSY1, PtrSuSY4),and encoding several AGP/FLA arabinogalactan proteinsthat have been implicated in secondary wall biosynthesisare among the most highly expressed genes. This is consist-ent with the strong metabolic commitment of this tissue tosecondary wall biosynthesis. Interesting, among the 100most highly expressed genes, over 10% (12) encode pro-teins of unknown function, and 8 of these have no obviousArabidopsis homologs, and are poorly conserved in otherplants (www.phytozome.net). The most highly expressedgene encoding a protein of unknown function isPOPTR_0002s22040 (FPKM 1178), which has no homo-logs in other plants except cassava, which is related toP. trichocarpa. This suggests that many unknown func-tions potentially contributing to xylem development andsecondary cell wall biosynthesis remain to be discovered.Identification of alternative splicing in the xylemtranscriptomeA prerequisite for a comprehensive survey of alternativesplicing is the ability to reliably detect splice junctions.Our global survey of transcript variants based on the 20poplar xylem transcriptomes identified 199,096 splicejunctions, 118,751 of which were known junctions and80,345 were new junctions absent in the annotated gen-ome (Additional file 3). We observed, on average for oneindividual, 3.4 junctions per gene and a mean density of41.8 reads per junction. To assess the degree to whichknown and new junctions detected in the mRNA-Seq datamay represent AS events that vary between individuals,we investigated the frequencies at which splice junctionswere unique to one individual, or were found in commonto two or more individuals. Most known junctions weredetected in nearly all individuals (Figure 2A). However,among the new junctions revealed in our RNA-Seq data, a0%10%20%30%40%50%60%70%1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20Fraction of  junctionsNumber of individuals known junctions new junctions 0% 10% 20% 30% 40% 50% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20Fraction of AS eventNumber of individualsABFigure 2 Distribution of junctions and alternative splicing events. (A). Histograms of the distribution of known and new splice junctionsamong 20 P. trichocarpa individuals. We required splice junctions to be supported by at least two reads with a non-repetitive match position andwith a minimum of four bases on both sides of the junction. (B). Histograms of the distribution of alternative splicing events among 20individuals. Only genes with sufficient read coverage (FPKM ≥5) in all individuals were used.Bao et al. BMC Genomics 2013, 14:359 Page 4 of 13http://www.biomedcentral.com/1471-2164/14/359significant number (50%) were detected in only one indi-vidual, and few were detected in all or most individuals(Figure 2A and B). These could represent genotype-specific AS variants. It is also possible that they includelower-abundance splice variants that are widely expressedbut were only detected in a single or few individuals and/or partially processed transcripts that were only detectedin a single or few individuals.To explore potential AS events, adjacent exons werejoined into multi-exon genes via the spliced junctions. Weempirically classified the AS events into four differenttypes according to the structures of exons (see Methods).An illustrative example is shown for POPTR_0001s00260,annotated as encoding a transaldolase, in Figure 3. Thesequence coverage and junction reads clearly reflect an in-tron retention AS event at intron 11 (Figure 3). To valid-ate this result, we used RT-PCR on xylem RNA from eachof the 20 individuals with RNA-Seq data. Evidence for thepredicted intron retention isoform was found in all 20 in-dividuals tested, at abundances consistent with RNAseqread counts.For each individual, 7-11% of the total (40,668) annotatedgenes showed evidence for AS. In total, 11,232 P. trichocarpaxylem-expressed genes had evidence of AS, for a total of27,902 AS events (Tables 1 and 2; Additional file 5). Theseevents were categorized into the four known types of ASmodels, as shown in Table 2. In our data, we found that76% of all annotated genes had some level of sequencecoverage in xylem. Therefore, in total, we estimate that atleast 36% of xylem-expressed genes are alternativelyspliced. Recent evidence based on next-generation sequen-cing indicates a high incidence of AS in the human (86%of annotated genes), Arabidopsis (60%) and rice (33%) ge-nomes [1,10,11]. We found that intron retention is themost prevalent form of alternative splicing (40% of the ASevents), while exon skipping only constituted 8% and alter-native 3′ and 5′ splice sites were at intermediate frequen-cies (Tables 1 and 2). This is in contrast to mammals andhuman where exon skipping is the most prevalent mech-anism. Thus, based on previous analyses in plants, andnow in P. trichocarpa, intron retention seems to be themost common AS feature in plants.We next analyzed to what extent AS events observedin one individual were also observed the other 19 indi-viduals. Table 2 shows that the bulk of the AS variantswere observed in one or a few individuals. To excludelow abundance transcripts that could represent “tran-scriptional noise” of biological or technical origin, thesignificance of which is under debate [33], we filtered thedata to include only splice variants of genes with FPKMvalues of ≥ 5 in all individuals. As shown in Figure 2B,around 46% of AS events in this data set were individual-specific, and only 2% were conserved in all 20 individuals.Since the developing xylem tissue from individuals of theReverse strand                                 5.41 kbPOPTR_0001s00260 (transaldolase)1 2 3 4 5 6 7 8 9 10 11 12 13CoverageExon 11 98Intron 11 59Exon 12 9611 12PT02PT03PT04PT05PT06PT07PT08PT09PT10PT11PT12PT13PT14PT15PT16PT17PT18PT19PT20PT21Figure 3 Example of AS event observed by RNA-seq and RT-PCR. Top, gene model for Poptr_0001s00260 (annotated as encoding atransaldolase) from the Phyotzome browser of P. trichocarpa genome, v. 2.2 (www.phytozome.net). Exons are numbered 1–13. The 11th intron isretained in a splice variant. The average coverage is shown for the retained intron and adjacent exons in sample PT19. The mapped reads arepresented below the illustrated retained intron and adjacent exons by short black lines, and the junction reads are denoted by the broken readsbelow the gene structure models. Bottom, RT-PCR analysis of Poptr_0001s00260 transcripts in RNA samples from individual poplar trees. Arrowindicates longer RNA isoform due to intron retention.Bao et al. BMC Genomics 2013, 14:359 Page 5 of 13http://www.biomedcentral.com/1471-2164/14/359southern (PT02-PT13) populations and northern (PT14-PT21) populations were harvested in two different years(from the same common garden, same week, and sametime of day), the distribution of AS events for these twoexperiments were analyzed separately, but the distribu-tions were essentially the same as the combined analysisshown in Figure 2. This indicates a high degree of alterna-tive splicing polymorphism within this species that wouldnot be detected if only a single or few individuals had beensampled.P. trichocarpa is considered a model system for thestudy of wood (secondary xylem) development [34],which involves differentiation of vessel element and fibercells from derivatives of the vascular cambium stem cellpopulation. These cells undergo cell expansion followedmassive secondary wall deposition and lignification, andprogrammed cell death [35]. Many of the genes involvedin secondary wall formation, and transcriptional regulatorsof this process [36] are well known. To investigate the pat-tern of AS in these genes, we generated a dataset of 389cell-wall biosynthesis related and 598 xylem-expressedtranscription factor (TF) genes (Additional file 6). Wefound that 38.9% (149) of the cell-wall biosynthesis relatedgenes have evidence for alternatively splicing, for a total of424 AS events. Of the transcription factor genes, 38.7%(232) showed evidence of alternative splicing, for a total of584 AS events.An important class of secondary cell wall related enzymesare glycosyl transferases, and we found evidence for exten-sive alternative splicing of glycosyl transferase genes (with130 events in 29 genes). For example, POPTR_0005s06280,a glycosyl transferase highly expressed in developing xylem(FPKM=182) and homologous to the Arabidopsis GLU-CURONIC ACID SUBSTITUTION OF XYLAN4 (GUX4)gene involved in xylan biosynthesis [37], exhibited 16 ASevents across the 20 individuals, with 2 events commonto ≥ 15 individuals and 3 unique to a single individual(Additional file 6). POPTR_0007s04030, similar toArabidopsis GUX1, a second GUX gene highly expressedin developing xylem (FPKM =141), exhibited three ASslice forms common to all 20 individuals: two alternative3′ splice acceptor sites in the second intron, and analternative 5′ splice acceptor site in the fourth intron(Additional file 6). These data suggest that the diversity ofglycosyl transferase proteins involved in secondary cellwall polysaccharide biosynthesis could be greatly ex-panded by AS in developing secondary xylem. It is inter-esting to note that a particular feature of human glycosyltransferases is alternative splicing, which leads to variabil-ity in the N-terminal regions of the proteins [38].Among the TF genes, we found evidence for AS in mem-bers of the all TF classes examined (Additional file 6).Among these events, 42 were common to ≥ 15 individ-uals, while 242 (41%) were unique to one individual.We found 15 AS events in 12 TF genes that were commonto all individuals, including events in homologs ofArabidopsis ANAC078/NAC2 (POPTR_0010s23650),KNAT6 (POPTR_0012s08910), and MYB103 (POP-TR_0003s13190), a gene implicated in regulation of cellwall biosynthesis [39]. Interestingly, two duplicated genesTable 1 Number of AS genes and events in 20 individualsIndividual AS gene A3SS A5SS IR ESPT02 2972 1209 708 1870 257PT03 2843 1082 595 2032 227PT04 4047 1796 1003 3115 358PT05 3200 1481 837 1782 321PT06 2950 1340 767 1868 278PT07 3658 1469 795 3120 331PT08 3751 1510 894 3074 285PT09 3833 1548 938 3124 299PT10 2915 1423 815 1446 247PT11 2357 893 484 1522 174PT12 3663 1832 1048 1991 236PT13 3750 1796 1041 2093 325PT14 4088 1614 952 3176 488PT15 4584 2059 1305 3232 416PT16 3470 1720 1044 1647 306PT17 4422 1799 1102 3301 388PT18 4346 2382 1464 2517 336PT19 4567 2145 1299 3131 451PT20 4478 2471 1498 2737 355PT21 3510 1807 1161 1500 402Total 11232 8963 5545 11175 2219Fraction1 27% 32% 20% 40% 8%1Fraction of AS genes: 11232 divided by number of annotated genes (40,668).Fraction of AS events: number of each type AS events.Table 2 Summary of alternative splice events detected byRNA-SeqAlternative splice (AS) type1, 2A3SS A5SS IR ES TotalNumber AS events detected 8963 5545 11175 2219 27902% of AS events 32.1 19.9 40.0 8.0 100Number in 1 individual 4560 2873 4194 1224 12851Number in ≥ 15 individuals 572 310 651 59 1592Number in 20 individuals 195 80 134 15 424% in 1 individual only 50.1 51.2 37.5 55 46% in ≥15 individuals 6.4 5.6 5.8 2.7 5.7% in 20 individuals 2.2 1.4 1.2 0.7 1.51A3SS, alternative 3′ splice site; A5SS, alternative 5′ splice site; IR, intronretention; ES, exon skip.2Data are for all AS events detected.Bao et al. BMC Genomics 2013, 14:359 Page 6 of 13http://www.biomedcentral.com/1471-2164/14/359encoding C2H2 zinc transcription factors (POPTR_0003-s07180 and POPTR_0001s16080, both homologs of theArabidopsis SUPPRESSOR OF FRIGIDA4 (SUF4) C2H2gene, had intron retention AS forms in all 20 individuals.The SUF4 gene is alternatively spliced in Arabidopsis [40],and a similar AS pattern, retention of the last intron, isfound in both the poplar homologs and in the ArabidopsisSUF4.3 splice variant. All cell wall formation and TF re-lated genes with AS variants are listed in Additional file 6.Alternative splicing isoform variation between individualsAlthough most documented AS isoform variation occursbetween tissues or environmental conditions, differencesexist among individuals [12-14]. In addition to qualita-tively detecting AS events, Illumina-based RNA-Seq cangenerate reliable quantitative measurements of AS levels.For each of AS event type, reads derived from specificregions incorporated into an mRNA indicate quantita-tive expression of one alternative RNA isoforms. Weused an inclusion and exclusion of junction readsmethod to quantify isoform expression in the RNA-Seqdata set (see Methods).Using this approach to compare the isoform ratio ineach individual relative to the other individuals, classesof AS events were observed that had markedly differentisoform ratios between different individuals. One ex-ample is shown in Figure 4A, which depicts a scatterplotof all long (inclusion) isoform ratios of events in individ-ual PT17 against individual PT18 both from the northpopulations. Although there was an overall good correl-ation between isoform (L) ratios in the two individuals(r = 0.79), many outliers were evident. One such eventwas in POPTR_0002s21650 (predicted glycosyl hydrolasefamily 9 protein/ endo-1,4-beta-glucanase). This genecontains 4 exons and the first intron can be eitherretained or spliced out. Intron retention is the dominantisoform in PT18 but the minor isoform in PT17. RNA-Seq data indicated significantly different ratios of the L(intron retention) vs. short (S; intron spliced) in the twoindividuals (Figure 4B). This result was validated by RT-PCR (Figure 4B).For the RNA-Seq dataset we sampled developingxylem from 12 individuals taken from the southern areaof the P. trichocarpa collection, and 8 from the northernarea of the collection (Additional file 1) that were grownin a common garden. To assess AS isoform variation be-tween P. trichocarpa individuals in these populations ona global level, the correlations among the vectors of L-isoform frequencies for all detected AS events betweenpairs of samples were determined (Figure 5A). To con-trol different sequence coverage of different individuals,we also filtered the data to include only splice variantsof genes with FPKM values of ≥ 5 in all individuals. Cor-relations were clustered according to similarity usingaverage linkage hierarchical clustering. Samples from in-dividuals in north populations had much more variationthan samples from those in the southern populations,and overall, Spearman rank correlations (r) ranged fromof 0.75 to 0.87, indicating that the level of variation ob-served in the PT17 vs. PT18 comparison is common be-tween individuals sampled. One striking observation inthis analysis was the strong clustering of the 12 individ-uals from southern populations (PT02-PT13) by theirfrequency of shared isoforms. Pairs from southern popu-lation had generally higher correlations between eachother than with individuals from the northern popula-tions (PT14-PT21) and hierarchical clustering based onthese results identified two groups that correspond toAPOPTR_0002s21650Glycosyl hydrolase family 9PT17  PT18RT_PCRRNA-seq 17%   59%   isoform(L) ratioB0 0.2 0.4 0.6 0.8 1 1.2 0 0.2 0.4 0.6 0.8 1 1.2 Isoform(L) ratio of AS in PT17Isoform(L) ratio of AS in PT18r=0.79Figure 4 Differences in isoform frequency between twoindividuals. (A). Scatter plot showing all long isoform (intronretention) AS frequencies between two individuals (PT17 and PT18).Each point represents an intron retention event. (B). An example ofsignificant isoform variation in RNA specific to Poptr_0002s21650(predicted to encode a glycoside hydrolases family 9 protein)between PT17 and PT18 and RT-PCR validation of Poptr_0002s21650isoform frequency variation in PT17 and PT18. The migration of PCRproducts corresponding the long (L, intron retention) and short (S)isoforms by gel electrophoresis is shown, with the relativefrequencies was estimated by RNA-Seq data.Bao et al. BMC Genomics 2013, 14:359 Page 7 of 13http://www.biomedcentral.com/1471-2164/14/359the northern and southern populations (Figure 5A). Thissuggests that xylem RNA samples isolated from the north-ern populations are more heterogeneous in their AS vari-ation than those isolated from the southern populations.Two samples, PT18 and PT20, are from northern popula-tion individuals within close geographical proximity, yetshow the lowest AS similarity. We have recently observedthat P. trichocarpa individuals from northern populationsexhibit higher nucleotide diversity than those from south-ern populations (A. Geraldes, C. Douglas, and Q.C. Cronk,unpublished), and these trees grow in a region in whichhybridization with closely related species P. balsamiferamay occur. Thus, one explanation for this finding is thatintrogression of alleles from P. balsamifera into one orboth of these two individuals has led to higher thanexpected genetic diversity, reflected in the observed lowerthan expected isoform correlation. Alternatively, environ-mentally induced physiological variation could have af-fected AS variation differently in the northern individualsin the year the xylem samples were isolated.Next, we investigated how many AS events showedsignificant difference between individuals within andamong the population. To control the different sequen-cing coverage of different individuals, we filtered thedata to include only splice variants of genes with FPKMvalues of ≥ 5 in all individuals. We found 7.0% and13.1% (8.6%, excluding more divergent PT18 and PT20samples) of AS events respectively that are predicted tohave significant differences (FDR< 0.05) in the isoformratio between at least two individual pairs within south-ern and northern populations (Figure 5B). This was alsoconsistent with the lower correlation of isoform ratios inPT18PT20PT21PT19PT14PT16PT15PT17PT13PT12PT10PT07PT11P T04PT09PT08PT06PT05PT03PT02PT18PT20PT21PT19PT14PT16PT15PT17PT13PT12PT10PT07PT11PT04PT09PT08PT06PT05PT03PT02AB0% 2% 4% 6% 8% 10% 12% 14% within N1 within N2 within S FractionofASeventswithsignificantdifferencesbetweenindividuals0.8 0.9 1ValueColor KeyFigure 5 The extent of alternative isoform expression differences within population sets. (A). Spearman correlations of isoform (L) ratiosbetween individuals. Correlations were computed separately for each pair of individuals, and clustered according to similarity using averagelinkage hierarchical clustering. Only genes with sufficient read coverage (FPKM ≥5) in all individuals were used. (B). The fractions of AS eventsshow significant differences between at least two individuals within population sets (N1, 8 individuals from northern populations; N2, 6individuals (excluding more divergent PT18 and PT20 samples) from northern populations; S, 12 individuals from southern populations). Variationin AS events at an adjusted Fisher’s exact P-value cutoff corresponding to a false discovery rate of 5% between at least one pair of individualswithin the population set were considered significant. Only genes with sufficient read coverage (FPKM ≥5) in all individuals were used.Bao et al. BMC Genomics 2013, 14:359 Page 8 of 13http://www.biomedcentral.com/1471-2164/14/359north populations relative to southern populations(Figure 5A). We are not aware of similar analyses inplants, but they are in general agreement with an analysisof EST and cDNA data, which estimate that 21% of hu-man AS genes are affected by polymorphisms that alterthe relative abundances of alternative isoforms [41], andwith RNA-Seq analysis of 6 human individuals that es-timate that between 10% and 30% AS events showindividual-specific variation [1]. However, these frequen-cies are still below the level of ~60% variation in AS eventsamong different human tissues [1]. This observation is notsurprising, because the same tissues of different individ-uals are functionally more similar than different tissues ofthe same individuals.DiscussionTraditionally, genome-wide studies of AS were carriedby sequencing of ESTs and using exon arrays, but suchapproaches were limited by cost and ability to only effi-ciently identify common AS events [42]. It is now feas-ible to conduct deep and comprehensive sequencing oftranscriptomes in a high throughput and cost effectivemanner using next generation sequencing such as theIllumina platform employed in this study, making it pos-sible to comprehensively survey both gene expressionlevels and splicing variation, and to detect rare ASevents. In this paper we present a global analysis of geneexpression and alternative splicing in the model tree spe-cies P. trichocarpa, focusing on over 30,000 genesexpressed in the developing xylem tissue taken fromtwenty unrelated individuals. We compiled a catalog ofxylem-expressed genes, including nearly 13,000 genesexpressed at moderate to high levels (average FPKM ≥5;and 100 genes very highly expressed genes (FPKM >460; Additional file 4). Using the RNA-Seq approach, wemapped and annotated the intron-exon junctions inthese genes, which allowed us to assess splicing patternson a global level in the twenty individuals analyzed.A striking finding in our analysis, and one that dem-onstrates the power of next generation RNA-Seq to effi-ciently reveal diverse patterns of AS, is the observationthat there are significantly more new (non-annotated inthe P. trichocarpa reference genome) intron-exon junc-tions in the set of 20 unrelated individuals than weredetected in any one single individual. Many of these mayrepresent genotype-dependent AS variants and suggest aremarkable diversity in AS patterns. The bulk of the ASand AS variation we observed is likely to be biologicallyrelevant. Similar levels of AS variation and isoformabundance variation were observed in two separatelyharvested sets of developing xylem (those from the 12southern and 8 northern populations). Furthermore, se-lected examples of AS were verified by RT-PCR, manyof the events are in moderately to highly expressedgenes, many transcript variants were present at highlevels, and we found examples of AS in P. trichocarpagenes known to have splice variants in Arabidopsis (seebelow). As RNA-Seq is quantitative, it can be used to ac-curately determine isoform expression levels that resultfrom AS. In principle, it is possible to determine the abso-lute quantity of every RNA variant in a cell population,and directly compare results between experiments. Thus,in addition to revealing important new insights into alter-native splicing complexity in the P. trichocarpa transcrip-tome, our results show that RNA-Seq data can be used toreliably measure AS levels in large sample sizes.While AS in humans is common and up to 86% ofgenes have evidence of AS [1], AS in plants has not beenas extensively studied. Furthermore, relatively few plantAS events have been functionally characterized, but evi-dence suggests that AS participates in important plantfunctions, including stress responses, and may impact do-mestication and trait selection. Recent evidence based onnext-generation sequencing indicates a high incidence ofAS Arabidopsis (35% - 60%) and rice (33%) genomes[1,5,10,11]. Consistent with those levels, our findingssuggest that 36% of xylem-expressed transcripts in P.trichocarpa are alternatively spliced. This estimate of ASlevels in P. trichocarpa is much higher than previous esti-mates based on cDNA/EST sequencing [26]. As well, only3,063 out of 40,668 genes (7%) in the P. trichocarpa gen-ome (v2.2) are annotated as having at least two transcripts.It is likely that the number of alternatively spliced genesidentified in P. trichocarpa will increase with larger andmore comprehensively sampled tissue-specific tran-scriptome sequence collections. In addition, plants re-spond to the environment in diverse and complex ways,and only a small proportion of these conditions havebeen addressed in next-generation sequencing projectsthat would reveal AS.Until now, most studies of splicing variation have beenin mammals and humans. Our study adds to the plantliterature that has so far focused primarily onArabidopsis and rice, which has suggested that plantsand animals differ in the predominant AS types. As inArabidopsis [10] and rice [11], we found that intron re-tention is the most prevalent form of alternative splicingaccounting for 40% of the AS events (Tables 1 and 2). Incontrast to humans where it is the predominant form,exon skipping only constituted 8% of the P. trichocarpadeveloping xylem AS events, consistent with results inArabidopsis and rice. While recent work in Arabidopsisindicates that over 50% of genes display the non-IR AS[5], the relative prevalence of intron retention AS andpaucity of exon skipping in all three plant taxa studiedin depth (Arabidopsis, rice, P. trichocarpa) suggests thatthe mechanisms of splice site recognition and splice siteselection differ between plants and animals.Bao et al. BMC Genomics 2013, 14:359 Page 9 of 13http://www.biomedcentral.com/1471-2164/14/359We found many genes involved in cell wall biosyn-thesis and transcriptional regulation, which play keyroles in secondary xylem development, are alternativelyspliced. Future studies are needed to characterizechanges in AS over the course of secondary xylem devel-opment, and the potential functional consequences ofthe AS observed. One way of assessing functional signifi-cance is conservation of AS across different taxa. Whilethere are too few documented examples to make globalcomparisons, it is interesting to note that our initial ana-lysis showed that two duplicated P. trichocarpa xylemexpressed genes encoding C2H2 zinc transcription fac-tors (POPTR_0003s07180 and POPTR_0001s16080)share a similar AS pattern with the Arabidopsis SUP-PRESSOR OF FRIGIDA4 (SUF4) C2H2 orthologous gene[40] in all 20 P. trichocarpa individuals, retention of thelast intron as found in the Arabidopsis SUF4.3 splicevariant. Alternative splicing of INDERMINATE DO-MAIN 14 (IDD 14), which encodes another C2H2 tran-scription factor in Arabidopsis and rice, has been shownto be functionally important in a generating functionallydistinct TF heterodimer that acts as a competitive in-hibitor in regulating starch metabolism [43]. Conser-vation of alternatively spliced of genes of this class inP. trichocarpa and Arabidopsis and such functional infor-mation suggests that these splice variants may play bio-logically significant roles. Another example of thefunctional consequences of AS in Arabidopsis involves themodulation of the relative amounts of the peroxisomalversus cytosolic transthyretin-like (TTL) protein by AS,which affects ureide biosynthesis [44]. Given the metabolicand developmental complexity of secondary xylem andsecondary wall formation, it would not be surprising if anumber of the numerous AS events we observed in themRNA population of this tissue were involved in deve-lopmental and metabolic regulation. Future studies inP. trichocarpa and other woody plants must focus on thefunctional consequence of AS on proteome diversity re-lated to wood formation.We found that 7% and 13% of AS events in developingxylem showed significant differences in isoform usage be-tween individuals in separately sampled sets of individualsfrom southern (12 samples) or northern (8 samples) popu-lations, respectively. This is in general agreement with ananalysis of EST and cDNA data that estimated that 21% ofhuman AS genes are affected by polymorphisms that alterthe relative abundances of alternative isoforms [41], andan RNA-Seq analysis of different tissues in 6 human indi-viduals, which estimated that between 10% and 30% ASevents show individual-specific variation [1]. Most previ-ous AS studies have concentrated on variation in splicingpattern across tissues, which is probably controlled by theavailability of trans-acting factors in different cell types.On the other hand, where it has been studied, genotype-specific variation in splicing is caused by cis-acting SNPsaffecting the splice-site region or splicing regulatory se-quences [45,46]. Thus, the high degree of SNP poly-morphism in wild populations of P. trichocarpa [24,25]may contribute to the diversity of AS variants among dif-ferent P. trichocarpa genotypes.Interestingly, samples taken from individuals in north-ern populations had more variation between individuals(13%) than samples taken from southern population in-dividuals (7%). We have observed that P. trichocarpa in-dividuals from northern populations exhibit highernucleotide diversity than those from southern popula-tions (A. Geraldes, C. Douglas, Q. Cronk, unpublished).Thus, higher SNP diversity could contribute higher ASvariation in the xylem collected from north populationindividuals. However, it is possible that some of the dif-ference in levels of AS variation between the separatelysampled southern and northern populations is the resultof environmental rather than genotypic effects on AS vari-ation, and further work on greater numbers of RNA-Seqsamples isolated at similar times from trees grown in com-mon gardens, will be necessary to compare the AS differ-ences between southern and northern populations.High-throughput transcriptome studies are producinga fast-growing catalog of splicing variation in humanpopulations, but so far information on the functionalimpact of such splicing variation is limited, and few ifany analogous studies are available for plants. Thus, ourresults provide a new glimpse of the evolutionary andphenotypic consequences of AS variation in plant popu-lations. If the AS events conserved in 20 individualswere functionally relevant, they would be expected to beunder strong purifying selection and would thus beexpected to have significant impacts on phenotype,which will need to be tested by functional studies of se-lected examples (e.g. [43]). However, we also found alarge number of putative genotype-specific AS events atapparent low frequency levels in the population. Somevariants may be rare because they have recently arisen,or because they are deleterious and are being selectedagainst, but it is also possible that some variants play arole in adaptation and may be present in at a certain fre-quency in different populations of the species. The latterpossibility can be tested by larger scale population-widetranscriptome studies. Nevertheless, they suggest that,like SNP variants in a relatively small number of genesimportant for adaptation in P. trichocarpa [47], some ofthe AS variation in wild populations of P. trichocarpacould contribute to phenotypic variation important forlocal adaptation. However, many of these low frequencyvariants may represent neutral variation. Thus, AS vari-ation may provide a reservoir of novel exons into a minor-ity (minor-isoform) of a gene’s transcripts as suggested byModrek and Lee [48]. Because a gene’s ancestral functionBao et al. BMC Genomics 2013, 14:359 Page 10 of 13http://www.biomedcentral.com/1471-2164/14/359would be maintained by the major-isoform, this may freethe minor-isoform from functional constraint, thus redu-cing purifying selection. Thus, new exons/introns appearinginitially as minor splicing isoforms, may gradually gainfunctions over time, and became constitutive exons corre-lated with mutations that creating stronger splice sites.ConclusionsIn summary, this genome-wide analysis of alternativesplicing reveals high levels of AS in P. trichocarpa, andshows that splicing differences between individuals,including quantitative differences in isoform ratios, arefrequent in P. trichocarpa. This suggests the hypothesisthat individual-specific alternative splicing is a mechan-ism that accounts for part of individual phenotypic vari-ation in the plant populations, and several avenues areopen to testing of this hypothesis in the future. Futurestudies are needed to identify and elucidate the detailedmolecular mechanisms underlying potential splicing regu-latory SNPs and trans-factors, as well as to assess thepotential functional consequences of genotype-specificalternative splicing events. Furthermore, while the geno-types we sampled from across much of the range ofP. trichocarpa were grown in a common environment, theextent of subtle environmental variation on expressionand splicing patterns within a single genotype remains tobe explored, and will be a focus of future studies. OurRNA-Seq data also allowed us to map the transcriptionallandscape of the P. trichocarpa genome dedicated to theimportant process of wood formation, and the expressionprofiles of wood-formation related genes offer opportun-ities for functional studies of novel wood-related genes inthe future.MethodsSample collection and transcriptome sequencingThe trees sampled in this study with provenances rangingfrom 59.6°N to 44.0°N (Additional file 1) have been previ-ously described [27] and were maintained in a commongarden at the University of British Columbia. Developingxylem from twenty individuals selected for xylem tran-scriptome analysis [24] was harvested from current yearvigorously growing coppice stems the first week of July,2008 (samples PT02-PT13) or the first week of July, 2009(samples PT14-PT21) at mid-day, bark removed, and de-veloping secondary xylem tissue scraped using razorblades. The tissue was immediately frozen in liquid nitro-gen and stored at −80°C until used for RNA extraction.Library preparation and Illumina GAII sequencing werecarried out as described [24].Analysis of alternative splicing by real-time PCRTotal RNAs were isolated from developing xylem tissue,using PureLink RNA reagent (Invitrogen, USA), andcleanup and purified with Qiagen RNeasy plant mini kit(QIAGEN, USA) according to the manufacturer’s in-structions as previously described [24]. RNA was thesame set of samples used for Illumina GAII sequencinganalysis. 2 μg of total RNA was used for reverse tran-scriptase synthesis using the Omniscript RT Kit (Qiagen,USA). Gene-specific PCR primers were designed spanningthe location containing alternative splicing events. Primerpairs for POPTR_0001s00260 are 5′ ATGTCACTCTCCTTGCAATC and 3′ TCAGGCACGATCTCACTCA. Pri-mer pairs for POPTR_0002s21650 are 5′ CCAGAGGACATGACCACATC and 3′ CTATGAAGATTTCCAAAACCA. cDNA amplification using locus-specific primerpairs was done with Quick-Load Taq 2X Master Mix(New England Biolabs) using PCR condition at initial de-naturation at 95°C for 30 seconds, 40 cycles of 95°C for30 sec, 55°C for 30 sec, 68°C for 1.5 minute, and extensionat 68°C for 10 minutes.Mapping short reads to the P. trichocarpa genomeIllumina GAII RNA-Seq reads [24] were aligned to theP. trichocarpa genome v. 2.2 (http://jgi-psf.org/pub/compgen/phytozome/v6.0/Ptrichocarpa/assembly/) usingSOAP (v2.19) software [29]. The mapping criteria wereas follows: matches should be collinear to the genomeallowing up to three mismatches, but no indels.SOAPsplice (v1.1) was used for genome-wide ab initiodetection of splice junction sites from RNA-Seq [30].SOAPsplice is an effective tool for detecting not onlyknown splice junctions but also novel junctions. Allreads that could not be matched intact on the genomicsequence were searched to find a junction alignment.We required a junction site to be supported by at leasttwo reads with non-repetitive match position and also tohave a minimum of four bases on both sides of thejunction.Identification and quantification of alternative splicingusing RNA-SeqA gene was determined as alternatively spliced if therewas evidence of a 5′splice site (SS) spliced to multiple 3′SS, or of a 3′SS spliced to multiple 5′SS, or of at least1X coverage across full length of one intron (supportedby junction reads). We empirically classified the ASevents into four different types according to the struc-tures of exons. These four types include intron retention(IR), alternative 5′ splice site (A5SS), alternative 3′ splicesite (A3SS) and exon skipping (ES), as described byWang [1] and Zhang [11]. For each of AS event, readsderiving from specific regions can support the expres-sion of one alternative isoform or the other. For exampleof an A3SS event, if 12 splice junctions reads mapped toa splice junction joining a 5′SS to a 3′SS, and 4 splicejunction reads mapped to a splice junction joining theBao et al. BMC Genomics 2013, 14:359 Page 11 of 13http://www.biomedcentral.com/1471-2164/14/359same 5′SS to a different 3′SS, the first 3′SS would beconsidered Long (L)-isoform (inclusion), with a “Isoform(L) frequency” of 12 / (12+4) = 75%, as described byWang [1].Normalization of gene expression levels based on RNA-SeqGene expression levels based on RNA-Seq data were mea-sured as numbers of fragments per kilobase of exon in agene per million fragments mapped (FPKM) [32]. Meanread density values for exons, introns, and intergenic re-gions were computed in units of fragments per kilobase ofexon [intron/intergenic] model per million mapped reads,as described by Wang [1].Alternative splicing variation analysisTo assess the degree of similarity between different indi-viduals, we performed pairwise comparisons betweenevery sample pair, correlating the isoform ratios of ASevents. To exclude low abundance transcripts that couldrepresent “transcriptional noise” of biological or tech-nical origin, the significance of which is under debate[33], we filtered the data to include only splice variantsof genes with FPKM values of ≥ 5 in all individuals.Spearman correlation coefficients were computed foreach pairwise comparison, and the resulting correlationmatrix was clustered using average linkage hierarchicalclustering to generate a tree (Figure 5A). To assess pos-sible individual-specific expression for each event, aFisher’s exact test was performed to evaluate the signifi-cance of the 2x2 table in which reads were divided by: 1)individual of origin (i.e. PT02 versus PT03); and 2) readtype (i.e. long isoform versus short isoform). Variation inAS events at an adjusted Fisher’s exact P-value cutoffcorresponding to a false discovery rate of 5% (in theBenjamini-Hochberg sense) between at least one pair ofindividuals within the population set were consideredsignificant individual-specific expression.Availability of supporting dataAll sequences were deposited in the Short Read Archiveat NCBI under accession number SRA026096. All ex-pression and AS event data are found in Additional file4 and Additional file 5.Additional filesAdditional file 1: Origin of 20 individual samples.Additional file 2: Summary of mRNA-Seq read counts and mappingstatistics.Additional file 3: Number of splice junctions in 20 individuals.Additional file 4: Top 100 xylem expressed genes and xylem FPKMvalues for all v. 2.2 genes in 20 individuals.Additional file 5: List of AS events in 20 individuals.Additional file 6: Cell wall formation and TF related AS genes andevents.Competing interestsThe authors declare that they have no competing interests.Authors’ contributionsHB: Performed bioinformatic analyses, analyzed data, wrote manuscript. EL:Performed experiments. SDM: Conceived of study, contributed to datageneration, reviewed manuscript. QCBC: Conceived of study, contributed todata generation, reviewed manuscript. YAEK: Conceived of study, supervisedbioinformatics work, reviewed manuscript. CJD: Conceived of study, helpedgenerate data, analyzed data, supervised bioinformatics and experimentalwork, wrote manuscript. All authors read and approved the final manuscript.AcknowledgmentsThis work was supported by the Genome British Columbia AppliedGenomics Innovation Program (Project 103BIO). We thank Dr. JuergenEhlting (University of Victoria) for helpful discussions and Dr. MichaelFriedmann (University of BC) for project management.Author details1Department of Forest Sciences, University of British Columbia, Vancouver,BC V6T1Z4, Canada. 2Department of Botany, University of British Columbia,Vancouver, BC V6T1Z4, Canada. 3Department of Wood Science, University ofBritish Columbia, Vancouver, BC V6T1Z4, Canada.Received: 27 October 2012 Accepted: 23 May 2013Published: 29 May 2013References1. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF,Schroth GP, Burge CB: Alternative isoform regulation in human tissuetranscriptomes. Nature 2008, 456:470–476.2. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ: Deep surveying of alternativesplicing complexity in the human transcriptome by high-throughputsequencing. Nat Genet 2008, 40:1413–1415.3. Ramani AK, Calarco JA, Pan Q, Mavandadi S, Wang Y, Nelson AC, Lee LJ,Morris Q, Blencowe BJ, Zhen M, et al: Genome-wide analysis of alternativesplicing in Caenorhabditis elegans. Genome Res 2011, 21:342–348.4. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, SeifertM, Borodina T, Soldatov A, Parkhomchuk D, et al: A global view of geneactivity and alternative splicing by deep sequencing of the humantranscriptome. Science 2008, 321:956–960.5. Marquez Y, Brown JW, Simpson C, Barta A, Kalyna M: Transcriptome surveyreveals increased complexity of the alternative splicing landscape inArabidopsis. Genome Res 2012, 22:1184–1195.6. Graveley BR: The haplo-spliceo-transcriptome: common variations inalternative splicing in the human population. Trends Genet 2008, 24:5–7.7. Kwan T, Benovoy D, Dias C, Gurd S, Provencher C, Beaulieu P, Hudson TJ,Sladek R, Majewski J: Genome-wide analysis of transcript isoformvariation in humans. Nat Genet 2008, 40:225–231.8. Yeo G, Holste D, Kreiman G, Burge CB: Variation in alternative splicingacross human tissues. Genome Biol 2004, 5:R74.9. Barbazuk WB, Fu Y, McGinnis KM: Genome-wide analyses of alternativesplicing in plants: opportunities and challenges. Genome Res 2008,18:1381–1392.10. Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, Fox SE, Wong WK,Mockler TC: Genome-wide mapping of alternative splicing in Arabidopsisthaliana. Genome Res 2010, 20:45–58.11. Zhang G, Guo G, Hu X, Zhang Y, Li Q, Li R, Zhuang R, Lu Z, He Z,Fang X, et al: Deep RNA sequencing at single base-pair resolution revealshigh complexity of the rice transcriptome. Genome Res 2010, 20:646–654.12. Douglas AG, Wood MJ: RNA splicing: disease and therapy. Brief FunctGenomics 2011, 10:151–164.13. Lalonde E, Ha KC, Wang Z, Bemmo A, Kleinman CL, Kwan T, Pastinen T,Majewski J: RNA sequencing reveals the role of splicing polymorphismsin regulating human gene expression. Genome Res 2011, 21:545–554.14. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, VeyrierasJB, Stephens M, Gilad Y, Pritchard JK: Understanding mechanismsBao et al. BMC Genomics 2013, 14:359 Page 12 of 13http://www.biomedcentral.com/1471-2164/14/359underlying human gene expression variation with RNA sequencing.Nature 2010, 464:768–772.15. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, PutnamN, Ralph S, Rombauts S, Salamov A, et al: The genome of blackcottonwood, Populus trichocarpa (Torr. & Gray). Science 2006,313:1596–1604.16. Jansson S, Douglas CJ: Populus: a model system for plant biology.Annu Rev Plant Biol 2007, 58:435–458.17. Brunner AM, Busov VB, Strauss SH: Poplar genome sequence: functionalgenomics in an ecologically dominant plant species. Trends Plant Sci2004, 9:49–56.18. Cronk QCB: Plant eco-devo: the potential of poplar as a model organism.New Phytol 2005, 166:39–48.19. Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A, Amini B,Bhalerao R, Larsson M, Villarroel R, et al: Gene discovery in the wood-forming tissues of poplar: analysis of 5, 692 expressed sequence tags.Proc Natl Acad Sci U S A 1998, 95:13330–13335.20. Li X, Wu HX, Southerton SG: Comparative genomics reveals conservativeevolution of the xylem transcriptome in vascular plants. BMC Evol Biol2010, 10:190.21. Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM,Charbonnel-Campaa L, Lindvall JJ, Tandre K, Strauss SH, et al: A Populus ESTresource for plant functional genomics. Proc Natl Acad Sci U S A 2004,101:13951–13956.22. Ralph S, Oddy C, Cooper D, Yueh H, Jancsik S, Kolosova N, Philippe RN,Aeschliman D, White R, Huber D, et al: Genomics of hybrid poplar(Populus trichocarpa x deltoides) interacting with forest tent caterpillars(Malacosoma disstria): normalized and full-length cDNA libraries,expressed sequence tags, and a cDNA microarray for the study ofinsect-induced defences in poplar. Mol Ecol 2006, 15:1275–1297.23. Ralph SG, Chun HJ, Cooper D, Kirkpatrick R, Kolosova N, Gunter L, TuskanGA, Douglas CJ, Holt RA, Jones SJ, et al: Analysis of 4,664 high-qualitysequence-finished poplar full-length cDNA clones and their utility forthe discovery of genes responding to insect feeding. BMC Genomics 2008,9:57.24. Geraldes A, Pang J, Thiessen N, Cezard T, Moore R, Zhao Y, Tam A, Wang S,Friedmann M, Birol I, et al: SNP discovery in black cottonwood (Populustrichocarpa) by population transcriptome resequencing. Mol Ecol Resour2011, 11(Suppl 1):81–92.25. Slavov GT, DiFazio SP, Martin J, Schackwitz W, Muchero W, Rodgers-MelnickE, Lipphardt MF, Pennacchio CP, Hellsten U, Pennacchio L, et al: Genomeresequencing reveals multiscale geographic structure and extensivelinkage disequilibrium in the forest tree Populus trichocarpa. New Phytol2012, 196:713–725.26. Baek JM, Han P, Iandolino A, Cook DR: Characterization and comparison ofintron structure and alternative splicing between Medicago truncatula,Populus trichocarpa, Arabidopsis and rice. Plant Mol Biol 2008, 67:499–510.27. Gilchrist EJ, Haughn GW, Ying CC, Otto SP, Zhuang J, Cheung D, HambergerB, Aboutorabi F, Kalynyak T, Johnson L, et al: Use of Ecotilling as anefficient SNP discovery tool to survey genetic variation in wildpopulations of Populus trichocarpa. Mol Ecol 2006, 15:1367–1378.28. Xie CY, Carlson MR, Ying CC: Ecotypic mode of regional differentiation ofblack cottonwood (Populus trichocarpa) due to restricted genemigration: further evidence from a field test on the northern coast ofBritish Columbia. Can J For Res-Rev Can Rech For 2012, 42:400–405.29. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J: SOAP2: animproved ultrafast tool for short read alignment. Bioinformatics 2009,25(15):1966–1967.30. Huang S, Zhang J, Li R, Zhang W, He Z, Lam T-W, Peng Z, Yiu S-M:SOAPsplice: genome-wide ab initio detection of splice junctions fromRNA-Seq data. Front Gene 2011, 2:46.31. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: anassessment of technical reproducibility and comparison with geneexpression arrays. Genome Res 2008, 18:1509–1517.32. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ,Salzberg SL, Wold BJ, Pachter L: Transcript assembly and quantification byRNA-Seq reveals unannotated transcripts and isoform switching duringcell differentiation. Nat Biotechnol 2010, 28:511–515.33. Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA,Mattick JS, Rinn JL: Targeted RNA sequencing reveals the deepcomplexity of the human transcriptome. Nat Biotechnol 2012, 30:99–104.34. Mellerowicz EJ, Sundberg B: Wood cell walls: biosynthesis, developmentaldynamics and their implications for wood properties. Curr Opin Plant Biol2008, 11:293–300.35. Groover A, Robischon M: Developmental mechanisms regulatingsecondary growth in woody plants. Curr Opin Plant Biol 2006, 9:55–58.36. Zhong R, Lee C, Ye ZH: Evolutionary conservation of the transcriptionalnetwork regulating secondary cell wall biosynthesis. Trends Plant Sci 2010,15(11):625–632.37. Lee C, Teng Q, Zhong R, Ye ZH: Arabidopsis GUX proteins areglucuronyltransferases responsible for the addition of glucuronic acidside chains onto Xylan. Plant Cell Physiol 2012, 53:1204–1216.38. Gong QH, Cho JW, Huang T, Potter C, Gholami N, Basu NK, Kubota S,Carvalho S, Pennington MW, Owens IS, et al: ThirteenUDPglucuronosyltransferase genes are encoded at the human UGT1gene complex locus. Pharmacogenetics 2001, 11:357–368.39. Zhong R, Lee C, Zhou J, McCarthy RL, Ye ZH: A battery of transcriptionfactors involved in the regulation of secondary cell wall biosynthesis inArabidopsis. Plant Cell 2008, 20:2763–2782.40. Kim SY, Michaels SD: SUPPRESSOR OF FRI 4 encodes a nuclear-localizedprotein that is required for delayed flowering in winter-annualArabidopsis. Development 2006, 133:4699–4707.41. Nembaware V, Wolfe KH, Bettoni F, Kelso J, Seoighe C: Allele-specifictranscript isoforms in human. FEBS Lett 2004, 577:233–238.42. Nembaware V, Lupindo B, Schouest K, Spillane C, Scheffler K, Seoighe C:Genome-wide survey of allele-specific splicing in humans. BMC Genomics2008, 9:265.43. Seo PJ, Kim MJ, Ryu JY, Jeong EY, Park CM: Two splice variants of theIDD14 transcription factor competitively form nonfunctionalheterodimers which may regulate starch metabolism. Nat Commun 2011,2:303.44. Lamberto I, Percudani R, Gatti R, Folli C, Petrucco S: Conserved alternativesplicing of Arabidopsis transthyretin-like determines protein localizationand S-allantoin synthesis in peroxisomes. Plant Cell 2010, 22:1564–1574.45. Hull J, Campino S, Rowlands K, Chan MS, Copley RR, Taylor MS, Rockett K,Elvidge G, Keating B, Knight J, et al: Identification of common geneticvariation that modulates alternative splicing. PLoS Genet 2007, 3:e99.46. Coulombe-Huntington J, Lam KC, Dias C, Majewski J: Fine-scale variationand genetic determinants of alternative splicing across individuals.PLoS Genet 2009, 5:e1000766.47. Ma XF, Hall D, St Onge KR, Jansson S, Ingvarsson PK: Geneticdifferentiation, clinal variation and phenotypic associations with growthcessation across the Populus tremula photoperiodic pathway. Genetics2010, 186:1033–1044.48. Modrek B, Lee CJ: Alternative splicing in the human, mouse and ratgenomes is associated with an increased frequency of exon creationand/or loss. Nat Genet 2003, 34:177–180.doi:10.1186/1471-2164-14-359Cite this article as: Bao et al.: The developing xylem transcriptome andgenome-wide analysis of alternative splicing in Populus trichocarpa(black cottonwood) populations. BMC Genomics 2013 14:359.Submit your next manuscript to BioMed Centraland take full advantage of: • Convenient online submission• Thorough peer review• No space constraints or color figure charges• Immediate publication on acceptance• Inclusion in PubMed, CAS, Scopus and Google Scholar• Research which is freely available for redistributionSubmit your manuscript at www.biomedcentral.com/submitBao et al. BMC Genomics 2013, 14:359 Page 13 of 13http://www.biomedcentral.com/1471-2164/14/359


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items