UBC Faculty Research and Publications

The landscape of transposable elements in the finished genome of the fungal wheat pathogen Mycosphaerella… Dhillon, Braham; Gill, Navdeep; Hamelin, Richard C; Goodwin, Stephen B Dec 17, 2014

RESEARCH ARTICLE Open Access

The landscape of transposable elements in the finished genome of the fungal wheat pathogen Mycosphaerella graminicola

Braham Dhillon1, Navdeep Gill2, Richard C Hamelin1,3 and Stephen B Goodwin4*

Dhillon et al. BMC Genomics 2014, 15:1132
http://www.biomedcentral.com/1471-2164/15/1132

* Correspondence: sgoodwin@purdue.edu
4USDA-ARS, Crop Production and Pest Control Research Unit, Purdue University, 915 W. State Street, West Lafayette, Indiana 47907-2054, USA
Full list of author information is available at the end of the article

© 2014 Dhillon et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Abstract

Background: In addition to gene identification and annotation, repetitive sequence analysis has become an integral part of genome sequencing projects. Identification of repeats is important not only because it improves gene prediction, but also because of the role that repetitive sequences play in determining the structure and evolution of genes and genomes. Several methods using different repeat-finding strategies are available for whole-genome repeat sequence analysis. Four independent approaches were used to identify and characterize the repetitive fraction of the Mycosphaerella graminicola (synonym Zymoseptoria tritici) genome. This ascomycete fungus is a wheat pathogen and its finished genome comprises 21 chromosomes, eight of which can be lost with no obvious effects on fitness and so are dispensable.

Results: Using a combination of four repeat-finding methods, at least 17% of the M. graminicola genome was estimated to be repetitive. Class I transposable elements, which amplify via an RNA intermediate, account for about 70% of the total repetitive content in the M. graminicola genome. The dispensable chromosomes had a higher percentage of repetitive elements than the core chromosomes. Distribution of repeats across the chromosomes also varied, with at least six chromosomes showing a non-random distribution of repetitive elements. Repeat families showed transition mutations and a CpA → TpA dinucleotide bias, indicating the presence of a repeat-induced point mutation (RIP)-like mechanism in M. graminicola. One gene family and two repeat families specific to subtelomeres also were identified in the M. graminicola genome. A total of 78 putative clusters of nested elements was found in the M. graminicola genome. Several genes with putative roles in pathogenicity were found associated with these nested repeat clusters. This analysis of the transposable element content in the finished M. graminicola genome resulted in a thorough and highly curated database of repetitive sequences.

Conclusions: This comprehensive analysis will serve as a scaffold to address additional biological questions regarding the origin and fate of transposable elements in fungi. Future analyses of the distribution of repetitive sequences in M. graminicola also will be able to provide insights into the association of repeats with genes and their potential role in gene and genome evolution.

Background

Mycosphaerella graminicola (synonym Zymoseptoria tritici, the causal agent of septoria tritici blotch, STB) poses a worldwide threat to wheat production, with yield losses of up to 30-40% or more during years with severe epidemics [1]. Although the use of fungicides and deployment of resistant wheat cultivars can help to contain M. graminicola losses in the field, breeding for resistance to STB has been slow and the resistance often is not durable [2]. With the rapid evolution of fungicide resistance in M. graminicola populations [3,4] and failure of resistance genes in the field [2], there is an urgent need for improved measures to control STB.

Toward this end, availability of the M. graminicola genome, sequenced to completion by the Department of Energy - Joint Genome Institute (DOE-JGI) [5], is a valuable resource that may be utilized for developing better disease-control strategies. This can be achieved by identifying and characterizing the genomic components that may have an effect on the disease-causing abilities of the pathogen. Besides specific genes involved in pathogenicity and host specificity, intergenic regions and repetitive sequences, especially transposable elements (TEs), also influence the structure, function and regulation of genes.

Repetitive sequences are those that exist more than once in a genome, and are now known to be common features of eukaryotic genomes. Repetitive sequences include gene families, pseudogenes, segmental duplications, tandem repeats and transposable elements. Transposable elements, also known as mobile elements, are a special class of repetitive sequences that can move from one locus to another in a genome, either encoding the proteins required for their own movement (autonomous TEs) or depending on other autonomous elements for their movement (non-autonomous TEs).
During the process of TE integration at a new genomic site, a few nucleotides flanking the new insertion site are duplicated, creating a target site duplication (TSD), which is a signature for TE insertion/excision [6].

TEs can be divided into two main categories based on their mode of replication: Class I TEs or Retrotransposons; and Class II TEs or DNA transposons that also include the Miniature Inverted-repeat Transposable Elements (MITEs). Retrotransposons typically include coding sequences for several proteins including a reverse transcriptase that transcribes the RNA to a cDNA, which is integrated back into the genome, thereby following a copy-paste mechanism to move to a new genomic location. Retrotransposons can be further classified as Long Terminal Repeat (LTR) retrotransposons, which carry long terminal repeats at both ends, and Non-LTR retrotransposons, which lack LTRs but have a poly-A tail at their 3’ end. Class II (DNA-based) transposons, on the other hand, follow a cut-paste mechanism and move to a new genomic location without an RNA intermediate. DNA transposons typically are delimited by terminal inverted repeats (TIRs) and encode a transposase domain. Transposon-encoded transposase recognizes the TIRs, excises the element and integrates it into a new location [6]. Helitrons and cryptons are also classified as DNA transposons although they lack the traditional TIRs. Occurrence of distinct structural features and protein domains can be used to identify and distinguish between the different classes of TEs in a genome [6].

Since their discovery during the late 1940s by Barbara McClintock [7], the perceived importance of repetitive sequences has undergone a fundamental change from being considered inert components of the genome to drivers of genome evolution [8]. Identification of TEs is important to understand their functional significance, as they have the ability to alter the function and structure of the genome.
For example, comparative genome analysis revealed that the three-fold genome size expansion of the oomycete Phytophthora infestans as compared to P. ramorum was mediated by TEs [9], altering the genome structure by creating gene-rich islands separated by vast expanses of repetitive sequences. Besides affecting the genome structure, TEs have the ability to create new genes [8,10] and to modulate the function of existing genes to create new phenotypes [11]. An example of the latter phenomenon has already been documented in the M. graminicola genome, where a single-copy DNA methyltransferase gene was duplicated into a subtelomeric region and then amplified among the telomeres to a dozen copies, all of which were subsequently recognized by the repeat-induced point mutation (RIP) machinery and inactivated, including the original copy [12]. This led to a loss of cytosine methylation in M. graminicola, although the RIP machinery appears to be intact [12].

A survey of the completely sequenced M. graminicola genome was done to identify and categorize repetitive sequences, especially TEs, and to determine their chromosomal locations, both as independent and nested insertions. These data, when combined with the genome annotation [5], will help determine the association of TEs with the genic regions and further our understanding of TE-mediated processes that may be involved in the regulation of gene function.

Results

Identification of repetitive sequences

The 39.7-Mb genome of M. graminicola was sequenced to completion by the DOE-JGI; telomere-to-telomere sequence information is available for all but chromosome 18, which has two gaps of unclonable DNA (sizes of 1.4 and 4.5 kb), and chromosome 21, which is missing one telomere [13]. With only three gaps, it is the most finished genome for a filamentous fungus and comprises 21 chromosomes that carry 10,952 predicted genes [13]. The chromosomes have been arranged and named in order of their decreasing lengths. The M. graminicola genome is highly plastic as it can randomly lose up to eight of the smallest chromosomes (numbers 14–21), aptly labeled as dispensable [14].

Four strategies were employed to estimate the repetitive content of the M. graminicola genome. Initially, only 1.1% of the M. graminicola genome was identified as being repetitive when using RepeatMasker (RM) [15] with the default RepBase Update [16] repeat library containing fungal-specific repeats. This led us to identify repeats ab initio to compile and annotate a custom repeat library, for which RECON [17] was used. When this custom repeat library was used with RM [15], discovery of divergent elements increased the estimated repetitive content of the M. graminicola genome to 16.7% (6.7 Mb) (Table 1). This repetitive fraction does not reflect the tandem repeats (low-complexity and simple sequence) and low-copy repetitive families with fewer than ten repeats per family; there were 2,500 low-copy repeat families in total that accounted for 2,279,215 bp (5.7%) of the M. graminicola genome. RepeatScout [18], another tool for de novo repeat identification, was used to create a second custom repeat library which, in conjunction with RM [15], identified 18.7% (7.4 Mb) of the M. graminicola genome as repetitive (Table 1). A k-mer based approach, TALLYMER [20], was also used to identify and plot repeats across all of the chromosomes. The repeats predicted by different methods have been shown as separate tracks (Figure 1) for an essential (chromosome 8) and a dispensable chromosome (chromosome 14). The core chromosomes contain 88% of the genome and 79% of the repetitive fraction, whereas the remaining 21% of the repeats are present on the 12% of the genome contained in the dispensable chromosomes.
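
The shares quoted above can be turned into a relative repeat density for each chromosome set. The snippet below is only a worked-arithmetic sketch using the rounded percentages from the text; the variable and function names are illustrative and do not come from the authors' analysis.

```python
# Shares quoted in the text (rounded): fraction of the genome and
# fraction of all repetitive DNA carried by each chromosome set.
genome_share = {"core": 0.88, "dispensable": 0.12}
repeat_share = {"core": 0.79, "dispensable": 0.21}


def relative_repeat_density(group):
    """Share of repeats divided by share of genome.

    A value of 1.0 would mean the set carries exactly as much repetitive
    DNA as expected from its size alone.
    """
    return repeat_share[group] / genome_share[group]


core = relative_repeat_density("core")
disp = relative_repeat_density("dispensable")
print(f"core: {core:.2f}, dispensable: {disp:.2f}, ratio: {disp / core:.2f}")
# prints "core: 0.90, dispensable: 1.75, ratio: 1.95"
```

With these rounded inputs, the dispensable set carries roughly twice the repeat density of the core set per unit of sequence.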
In general, the dispensable set of chromosomes had a statistically significantly higher (t = 6.1292, P < 0.0001) repetitive content compared to the core-chromosome set (Figure 2).

Table 1 Predicted repetitive content in the Mycosphaerella graminicola genome

Method                                     Repetitive bases   Percent repetitive
RECON, parsed                              4,629,313          11.7
RepeatMasker - RepBase Update              467,778            1.2
RepeatMasker - RECON, parsed               6,921,597          17.4
RepeatMasker - RECON, parsed, nolow*       6,634,996          16.7
RepeatMasker - RepeatScout                 7,400,585          18.6

*RepeatMasker nolow option, does not mask the low-complexity DNA or simple repeats.

Repeat annotation

The repeat families identified by RECON [17] were categorized into six major classes (Table 2). Based on the repeat copy number, 105 families of high-copy repeats (10 or more copies in the genome) were annotated in detail in the M. graminicola genome. However, based on their distribution in the genome and sequence overlap, 12 families were merged with other families, decreasing the effective number of families to 93 with copy numbers ranging from 7 to 272 (Additional file 1: Table S1). Among the 2,500 low-copy repetitive families, only 125 families occupying 525,399 bp (1.3%) of the M. graminicola genome could be annotated (Additional file 2: Table S2).

Among the different repeat classes, retrotransposons were the most common in the M. graminicola genome. In total, 21 LTR retrotransposon and four non-LTR retrotransposon families were identified. Class I TEs (both LTR and non-LTR retrotransposons) in total comprised 70.5% of the repetitive fraction in the genome. LTRs and TSDs could only be determined for 15 LTR retrotransposon families; the remaining 6 families had other characteristics of LTR retrotransposons except for the LTR. LTR lengths ranged from 110 to 378 bp and TSD lengths varied from 4 to 5 bp.
Average insertion age for the LTR retrotransposons was estimated to be 2.4 ± 1 Million years (My), with the oldest insertion event clocked at 5.6 My (Figure 3).

Unclassified repeats, those with structural features characteristic of the major groups but with no similarity to protein domains or structural features associated with sub-groupings of known repeats, were the next major category. Thirty-seven families of unclassified repeats were identified that occupied 14.9% of the repetitive fraction.

Fifteen families of non-MITE DNA transposons, thirteen MITE families and three families of helitrons (helicase domain-containing repeats) were also identified in the genome. Helitrons are atypical DNA transposons as they lack TIRs and do not generate a TSD upon integration. Three helitron-associated structural features, i.e., dinucleotide TA at the 5’ end, hair-pin loop followed by the tetranucleotide ‘CTRR’ at the 3’ end, were present in two families (families 98 and 765); only the hair-pin loop could be identified in the third family (family 637). DNA transposons in total accounted for 14.6% of the repetitive DNA. Among the thirteen MITE families discovered, nine had 2-bp TSDs (TA: 8 families; CT: 1 family), two had 5 - 6-bp TSDs and one family had an 8 - 9-bp TSD. Based on having the same TIR and TSD but different internal sequences, four (families 55, 112, 222 and 290) and three MITE families (246, 298 and 2546) could be grouped into two superfamilies.

Distribution of repeats across the chromosomes

A non-parametric runs test was used to check the randomness of repetitive sequence distribution across the M. graminicola chromosomes. Chromosomes were initially partitioned into bins of 100 kb each. Each bin was scored as 1 or 0, if the repetitive content of the bin was higher or lower, respectively, compared to the mean repeat content of the chromosome. A survey of these 100-kb bins showed that repetitive sequences on two core chromosomes, 8 and 10, had a nonrandom, clustered distribution.
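
The binning-and-scoring procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the Wald-Wolfowitz runs test with a two-sided normal approximation (the text does not say which variant or software was used), it scores a bin exactly equal to the mean as 0, and the function names `score_bins` and `runs_test` are hypothetical.

```python
import math


def score_bins(repeat_fraction_per_bin):
    """Score each bin 1 if its repeat content exceeds the chromosome mean, else 0."""
    mean = sum(repeat_fraction_per_bin) / len(repeat_fraction_per_bin)
    return [1 if value > mean else 0 for value in repeat_fraction_per_bin]


def runs_test(scores):
    """Wald-Wolfowitz runs test on a 0/1 sequence (normal approximation).

    Returns (runs, n1, n0, z, p_two_sided). Fewer runs than expected
    (negative z) indicates clustering of like-scored bins.
    """
    n1 = sum(scores)
    n0 = len(scores) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one 1 and one 0 to count runs")
    n = n1 + n0
    # A new run starts at every position where the score changes.
    runs = 1 + sum(1 for a, b in zip(scores, scores[1:]) if a != b)
    expected = 2.0 * n1 * n0 / n + 1.0
    variance = (2.0 * n1 * n0 * (2.0 * n1 * n0 - n)) / (n * n * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, n1, n0, z, p_two_sided


# Toy example: repeat-rich bins clustered at one end of a chromosome.
toy_fractions = [0.6, 0.5, 0.7, 0.6, 0.1, 0.1, 0.0, 0.1]
toy_scores = score_bins(toy_fractions)
toy_result = runs_test(toy_scores)
print(toy_scores, toy_result)
```

The normal approximation becomes unreliable when the number of bins or runs is small, so results for short chromosomes should be read with care.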
Because the number of runs was below the threshold value of 10 for all of the dispensable chromosomes and one core chromosome (number 13), we repeated the survey for smaller bins (50 kb). Using 50-kb bins, four additional chromosomes (core chromosome 6 and dispensable chromosomes 17, 18 and 19) showed a non-random distribution of repetitive sequences (Table 3; Additional file 3: Table S3). The distribution of genes also was tested in the smaller bins and showed a non-random pattern on chromosomes 8 (core chromosome) and 15 (dispensable chromosome).

Randomness of inter-element distances also was tested. Using the Shapiro-Wilk test, the null hypothesis that repeats are distributed randomly was rejected and the P values were highly significant for all M. graminicola chromosomes (results not shown). Therefore, this analysis does not seem to discriminate as well as the non-parametric runs test.

Nested blocks of elements

Repetitive elements inserted into other elements are described as nested. A nested cluster is initiated when a TE inserts (primary insertion event) into the base element (element in which the primary insertion occurs). There can be multiple primary insertions into the base element, as a new independent insertion event can occur adjacent to the existing insertion. A total of 78 putative nested element clusters was identified in the M. graminicola genome, with 69 (88%) clusters present on the core chromosomes. Their length ranged from 7.5 to 63 kb, with 33 clusters being greater than 20 kb. These clusters account for about a quarter (24%) of the total repetitive content in the genome. The number of different elements in a cluster varied from 2 to 7. An LTR retrotransposon was the base element in 49 clusters (Figure 4), out of which TSDs were verified in 35 clusters (Table 4). In eight clusters (length greater than 20 kb) with a DNA transposon or an unclassified element as the base element, TSDs were verified for the primary LTR retrotransposon insertion event. In general, the primary insertion event was dominated by LTR retrotransposons, with 59 such cases identified. One non-LTR retrotransposon family (family 623), found in 30 primary insertion events, was noticeably absent from all nested structures with a DNA transposon as the base element.

Genes also were found associated with the clusters of transposable elements. Sixteen genes (including kinases, peptidases and cytochrome c) were found inserted in the clusters and 60 genes were found in the close vicinity of these nested clusters (within 2 kb of the cluster boundary). Eleven of these genes contained signal peptides and were potentially secreted, and others included genes encoding enzymes such as hydrolases, kinases, cytochromes, and chloroperoxidases.

Figure 1 Repetitive content on representative core (Chromosome 8) and dispensable (Chr. 14) chromosomes of Mycosphaerella graminicola. Repetitive sequences in the M. graminicola genome were identified using four repeat-finding methods: RepeatMasker (RM), RECON, RepeatScout and TALLYMER. In addition to the default RepBase Update repeat library, two custom repeat libraries were used with RM. These custom repeat libraries contained repetitive elements identified by RECON and RepeatScout. Percent GC content was plotted for each chromosome. A sharp decrease in percent GC content coincides with the occurrence of repetitive sequences on both chromosomes. Different methods can be compared for repeat coverage on both chromosomes. RM-RepBase Update library (yellow bars) had the poorest coverage of repeats. The RM-RECON library (blue bars) performed better than RECON (green bars) alone. RepeatScout output is not shown here. The performance of TALLYMER (bottom red bars) was equivalent or better than that of the RM-RECON combination. Predicted genes are indicated by red bars.

Subtelomeric regions

The length of a subtelomere was defined as a continuous stretch of repetitive DNA starting from the telomeric repeats, without any intervening unique DNA. Different chromosomes had varying lengths of subtelomeric repeats (Figure 5). The length of subtelomeric regions varied from 1.6% (chromosome 20) to 91.4% (chromosome 1) of the 100-kb sequence that was analyzed from the end of each chromosome. Although the repeats in subtelomeric regions did not show a biased distribution, some chromosomes had subtelomeric regions that were highly similar to subtelomeres on other chromosomes. Long stretches of highly similar subtelomeric sequences were shared between three pairs of core and dispensable chromosomes, chromosomes 2 and 17 (58 kb), 9 and 21 (19 kb) and 4 and 18 (7 kb), and one pair of core chromosomes, 8 and 13 (37 kb).

Figure 2 The distribution of different classes of repetitive sequences across the Mycosphaerella graminicola chromosomes. The contributions of six major repeat classes to the total repetitive content on each M. graminicola chromosome are shown. Inset: The percent repetitive content on each M. graminicola chromosome. The dispensable chromosomes have a higher percentage of repetitive sequences as compared to the core chromosomes.

Two repeat families were limited to subtelomeric regions only. One was a family of non-LTR retrotransposons and the other was a repeat family containing a DNA methyltransferase domain [12], atypical of repetitive elements. The non-LTR retrotransposon family (family 4) that was limited to subtelomeric regions contained 34 full-length elements and 49 truncated elements. Depending on the number of tandem repeat copies in the non-LTR element, the full-length elements were 3.4 - 3.5 kb long. These full-length elements were present on 16 out of the 21 chromosomes and on 26 of the 41 chromosome ends. Thirteen cases were identified on different chromosomes where a truncated element was found in tandem to the full-length element. One to three copies of the telomeric hexamer repeat TTAGGG were present at the 3’ end of all the full-length and 38 truncated elements. The telomeric repeat-containing 3’ end of the non-LTR element was always pointing away from the telomere. Therefore, the orientation of the elements on one end of the chromosome was the same, whereas elements on the other chromosome end were present in the opposite orientation. Telomere-associated repetitive elements with similar characteristics, i.e., containing a reverse-transcriptase gene and terminating in telomeric repeats at their 3’ ends, have been described previously as Penelope-like elements (PLEs) [19].

Table 2 Classification of Mycosphaerella graminicola repetitive families identified de novo by using RECON

Class         Repeat class (# families)     Repeat type     Total length (bp)*   Percent of repetitive content   Percent of genome
Class I       LTR Retrotransposon (21)      Ty1-Copia       1,039,062            15.7                            2.6
                                            Ty3-Gypsy       1,803,457            27.2                            4.5
                                            Unclassified    766,193              11.5                            1.9
                                            Subtotal        3,608,712            54.4                            9.1
              Non LTR Retrotransposon (4)   Tad1-like       697,083              10.5                            1.8
                                            Unclassified    371,533              5.6                             0.9
                                            Subtotal        1,068,616            16.1                            2.7
Class II      DNA Transposon (15)           CACTA En/Spm    78,909               1.2                             0.2
                                            hAT             70,346               1.1                             0.2
                                            Tc1-Mariner     58,977               0.9                             0.1
                                            Tc5/Pogo        44,941               0.7                             0.1
                                            Unclassified    228,727              3.4                             0.6
              MITEs (13)                                    98,726               1.5                             0.2
              Helitron (3)                                  390,961              5.9                             1.0
                                            Subtotal        971,587              14.6                            2.4
Unclassified  Unclassified (37)             Unclassified    986,081              14.9                            2.5
Total                                                       6,634,996                                            16.7

*These repeat estimates were calculated using families containing at least 10 members.

Figure 3 LTR retrotransposon insertion age in the Mycosphaerella graminicola genome. Insertion age for 149 elements from different M. graminicola LTR retrotransposon families was calculated using the nucleotide divergence between intact LTR pairs. The probable insertion age of each element is represented by square dots.

Besides telomere-associated repeats, one class of telomere-associated gene also was found (Figure 5). Initially, a large open reading frame (ORF) of ~ 3.2 kb was identified in the subtelomeric region of chromosome 12. Upon further annotation, the putative coding sequence was 3,480 bp long with a very low GC content of 34.7%. Similarity searches identified 13 copies (7 full length, 6 truncated) on nine chromosomes in the genome. A search of the NCBI non-redundant database revealed similarity to subtelomeric RecQ helicase from Schizosaccharomyces japonicus (at 5e-11). One pair of RecQ helicases was a part of the above-mentioned stretch of similar subtelomeric sequences between core chromosomes 8 and 13.

Tandem repeats

Based on their repeat unit lengths, tandem repeats are broadly classified into three categories: microsatellites (1–6 nucleotides); minisatellites (7–100 nucleotides); and satellites (>100 nucleotides). All dispensable chromosomes had a higher percentage of tandem repeats. An increase in minisatellite content contributed to this difference between core and dispensable chromosomes (Figure 6). Chromosomes 9 and 13 had an unusually high percentage of satellite repeats as compared to other core chromosomes.

As tandem repeats often occur in genes, the tandem repeat content of both the unmasked and the masked M. graminicola genome was compared.
In the unmasked sequence, 225 tandem repeat families (7,509 repeats) covering 556 kb of the genome were found. The repeat unit size varied from 1 to 1950 bp. On the other hand, the number of tandem repeat families in the masked sequence was reduced to 195 (5,109 repeats) and the repeat unit size varied from 1 to 479 bp. Tandem repeats in the masked genome covered a total of 374 kb with 107 kb (29%) being a part of the M. graminicola gene space. Thus, about one-third of the tandem repeats identified in the masked genome were components of genes, as the repetitive fraction was masked out prior to analysis.

Table 3 Randomness of repetitive sequence distribution along each Mycosphaerella graminicola chromosome using a non-parametric runs test

Chromosome   n     n1   n0   Runs   P value
1            122   35   87   55
2            78    26   52   36
3            70    23   47   31
4            58    17   41   21
5            58    24   34   29
6            54    22   32   22†    2.7 e−2
7            54    18   36   19
8            49    18   31   15†    1.9 e−6
9            43    17   26   19
10           34    12   22   11†    1.3 e−4
11           33    10   23   13
12           30    13   17   13
13           24    9    15   7
14           16    6    10   10
15           13    6    7    7
16           13    5    8    6
17           12    4    8    5†     1.5 e−2
18           12    4    8    7†     1.5 e−2
19           11    6    5    5†     2.7 e−2
20           10    6    4    4
21           9     4    5    6

Each chromosome was divided into non-overlapping 50-kb bins. Repetitive content in each bin was compared to the chromosomal average repetitive content and each bin was scored either 1 (bin repetitive content greater than chromosomal average) or 0 (bin repetitive content lower than chromosomal average). Each consecutive occurrence of 0 s and 1 s was calculated as a run. n, the total number of observations; n1, count of occurrences of ones; n0, count of occurrences of zeros; Runs, number of consecutive occurrences or runs of ones and zeros. †P < 0.0.

Figure 4 A nested cluster of transposable elements on Mycosphaerella graminicola chromosome 6. A 47-kb nested cluster with an LTR retroelement as the base is indicated. It has a 5-bp target-site duplication (TGGAA). Each triangle represents one insertion and the numbers inside the triangle refer to the repeat family. This cluster only had class I elements. These elements have been shaded as follows: light grey, LTR retrotransposons; dark grey, Non-LTR retrotransposons. The bottom panel shows the percent GC plot for the corresponding repeat region. A decrease in percent GC content in the repeat region is evident when compared to the flanking DNA sequence.

Extent of repeat-induced point mutations (RIP)

All repeat families showed elevated levels of transition mutations indicating RIP (Additional file 1: Table S1). A clear CpA → TpA dinucleotide bias was detected in 70 repeat families (Figure 7), while the remaining 24 families (10 MITEs and 14 unclassified repeat families) failed to show a specific dinucleotide bias. Although the analysis only looks at one strand, both C → T and G → A polymorphisms on that strand were quantified to account for RIP on the complementary strand. All of the unclassified repeats and these ten MITE families were less than 200 bp in their average match length. A pairwise comparison of repeats using the highest-GC element in each family as a reference sequence showed that the identity between elements in the same family ranged from 83 to 99%. However, two 6.7-kb sequences were identified on chromosome 7 that were 100% identical. These sequences contain the rDNA repeats, only two copies of which could be assembled in the released genome sequence.

Active elements

Almost all elements analyzed carried numerous mutations suggesting the presence of a RIP-like mechanism in M. graminicola.
However, elements belonging to at least one family of non-LTR retrotransposons (family 623) and one family of copia-type LTR retrotransposons (family 18) were found in the genome that carried minimal RIP-signature mutations.

Using the reverse transcriptase domain from the non-LTR retrotransposon, at least 119 similar sequences were detected in the genome. All of these sequences, except 11, carried one or more stop codons. The remaining 11 sequences also had many transition mutations, but none of them resulted in a stop codon in the reverse-transcriptase domain. These 11 elements had 90-95% sequence identity with each other and their RT domains were 97.5 - 99.9% identical. Also, a 12-bp TSD was identified for eight of the 11 sequences. This non-LTR retrotransposon is specific to fungi and belongs to the Tad-1 clade of non-LTR retrotransposons. A total of 22 repeat-derived EST sequences had 100% identity to 21 elements from this non-LTR family. Another LTR retrotransposon (family 242) was also represented in the EST dataset.

The LTR retrotransposon family with minimal evidence of RIP had only one complete copy that was present on chromosome 18. The element was 5,545 bp long with 515-bp LTRs and a 4-bp TSD (ACTT) (Additional file 2: Table S2). LTR alignment showed two mismatches, both transition mutations, which indicates a recent transposition event, as LTR sequences keep accumulating mutations with age. There was one truncated copy on chromosome 17 and two soloLTRs, one each on chromosomes 8 and 12.

Table 4 Nested clusters of repetitive elements in the Mycosphaerella graminicola genome with an LTR retrotransposon as base element and a verified target site duplication (TSD)

Chr   Cluster start   Cluster end   Base family   Cluster size (bp)   Target site duplication   Internal family§
1     910,151         924,049       49            13,898              GGTAG                     15
1     3,757,104       3,770,311     115           13,207              CCTAC                     263
2     162,839         175,222       242           12,383              CTTAG                     49
2     811,149         825,327       242           14,178              CTGTG                     623
2     1,675,912       1,693,177     15            17,265              CGAC                      242
2     1,868,450       1,894,757     263           26,307              AAGGT/C                   (2477 (495, 623))
2     3,605,891       3,620,926     15            15,035              ACAGG                     623
3     575,530         603,863       940           28,333              ATTGA                     (263 (623, 115))
3     787,112         832,923       495           45,811              CTAC                      (2477, 623, 633 (623), 49 (115))
3     1,573,516       1,594,725     495           21,209              CCAG                      (1045, 2477)
3     2,416,365       2,433,023     15            16,658              CTGC                      623
4     1,534,719       1,552,726     100           18,007              ATTAG                     609
4     1,567,363       1,594,101     491           26,738              ATAC                      (603, 623)
5     192,764         204,730       165           11,966              TTTTG                     623
6     191,008         210,165       242           19,157              ATGA                      (623, 242)
6     420,117         467,482       491           47,365              TGGAA                     (263 (623), 623 (495))
6     771,418         802,180       491           30,762              GGATG/GGCCG               623 (x3)
6     2,542,200       2,570,636     100           28,436              CATTG                     (623 (495))
7     1,261,818       1,284,705     633           22,887              GAGCT                     495
7     1,787,729       1,808,698     165           20,969              GCATC                     495
7     1,821,242       1,835,308     603           14,066              GCATA/G                   623
7     2,535,616       2,549,001     242           13,385              ATATC                     623
9     64,037          77,995        49            13,958              GATAG                     623
9     79,932          109,217       603           29,285              CATTGG                    623
9     962,581         977,124       49            14,543              CCAAT                     1008
9     1,052,214       1,078,572     242           26,358              TATGA                     (49 (263 (623)))
10    863,891         879,094       633           15,203              GCTGC                     242
10    1,306,971       1,330,127     165           23,156              AGATA                     (623 (623), 623)
12    609,353         623,337       940           13,984              GCTTC                     49
13    1,028,648       1,051,790     100           23,142              G/CAAAG                   495
14    411,612         433,895       603           22,283              TTAAG                     495
14    587,485         599,441       165           11,956              CGGTT                     623
15    192,089         204,601       242           12,512              GCTTT                     623
16    510,094         526,146       15            16,052              TTCT                      263
18    294,431         312,546       242           18,115              GAAGA                     (623, 623)

Columns: Chr, Chromosome. §The brackets represent the level of nesting.

Figure 5 Genes and repeats in Mycosphaerella graminicola subtelomeres. A 100-kb subtelomeric sequence from both ends of each chromosome was analyzed. The numbers between the bars refer to the chromosome number. The ruler below chromosome 21 marks every 1 kb interval. All of the elements have been color coded: orange, LTR retrotransposons; royal blue, Non-LTR retrotransposons; pink, DNA transposons; brown, MITEs; cyan, Helitrons; lime green, Unclassified; dark blue, repeats containing DNA methyltransferase gene sequence; dark green, RecQ genes; red, predicted genes.

Discussion

Repeat identification strategies

With the availability of faster and cheaper sequencing technologies, the real challenge is not sequencing a genome but its annotation. Early identification and analysis of repetitive sequences sets the ground for a better annotation of the genic and inter-genic regions. Several approaches were used for repetitive sequence analysis in the completed M. graminicola genome. Search strategies that exploit structural features to identify and classify TEs in sequenced genomes are known as similarity-based methods. RepeatMasker [15], a similarity-based approach, identifies repetitive elements based on their similarity to already known repetitive sequences. Besides similarity-dependent methods, two other approaches, de novo and k-mer based, can be used to define repetitive elements in a genome. Methods such as RECON [17] utilize whole-genome alignments to identify repetitive elements de novo. K-mer-based methods such as RepeatScout [18] and TALLYMER [20] calculate the frequency of oligomers of different lengths to delineate repetitive sequences in the genome. Since the k-mer and de novo methods do not rely on the existing repeat datasets, they are also suitable for identifying novel repeats in a genome.

With the default RepBase Update [16] repeat dataset comprising all of the fungal-specific repetitive elements, only ~1% of the M. graminicola genome was found to be repetitive. This indicates that the RepBase Update [16] fungal repeat dataset has very low coverage of fungal-specific repetitive sequences. RepBase Update [16] has limited sequence information with 256 sequences from 56 fungal species. This low repeat estimate may also reflect that most M. graminicola repetitive sequences are unique, as they were not found in other species. A similarity search of thirteen sequenced fungal genomes could only identify less than 1% of the repetitive sequences present in M. graminicola.

RM [15] was used in conjunction with repeat libraries derived from two repeat-finding approaches, RECON [17] and RepeatScout [18]. Of these two, the RECON [17] repeat library with a cutoff for high-copy repeats of 10 members/repeat family as used previously [17] was chosen for the subsequent in-depth analysis of the repetitive content of the genome to keep the number of families analyzed in detail reasonable.
However, low-copy repeat families (with 2–9 copies in the genome) also were annotated, of which only 125 families (0.05%) could be assigned a useful annotation (Additional file 2: Table S2). The output from RECON [17] is primarily used for first-pass classification of repeats in newly sequenced genomes [17]. A comparison of the repetitive fraction identified by RECON [17] alone to RECON followed by RM [15] revealed that RM was able to identify similar but divergent elements that were missed by RECON (Table 1). RepeatScout [18] output reports a consensus sequence for each family, whereas RECON [17] extracts individual elements for each family, which not only makes the annotation step easier, but also accounts for the divergent members of the repeat families. Therefore, even though RepeatScout [18] predicted a slightly higher proportion (1.9% more) of the M. graminicola genome as repetitive, we decided to annotate and report the RECON [17] output. The final repetitive fraction of 16.7% reported for M. graminicola is a conservative estimate as it excludes the low-complexity regions, simple repeats and repeat families containing fewer than 10 members/family. Adding all of those together estimated the total repetitive fraction at around 22.4% of the genome. The results from TALLYMER [20] were used for visualization purposes and correlated very well with the results from the other two methods (Figure 1). A decrease in percent GC content along the chromosome paralleled the occurrence of repetitive elements (Figure 1).

Relative insertion ages of 149 LTR retrotransposons that contained intact LTRs were estimated based on the occurrence of nucleotide substitutions between the 5’ and 3’ LTRs. However, the insertion timings may have been overestimated due to an elevated mutation rate as a consequence of RIP. Among the 149 retrotransposons with intact LTRs, only one element was found with identical LTR sequences (insertion age of 0 My). Multiple sequence alignments of this element with other members of its family show mutated/RIPed sites in the internal region of this element but the LTRs themselves somehow escaped the effects of such mutations. This is interesting as this element would be an ideal candidate for future studies to investigate the timing and extent of RIP in subsequent cycles of sexual reproduction.

Figure 6 Distribution of tandem repeats in the Mycosphaerella graminicola genome. Extent of microsatellites (1–6 base pair repeat unit), minisatellites (7–100 bp) and satellites (>100 bp) across the M. graminicola chromosomes.

Distribution of repetitive sequences
Analysis of a single chromosome (chromosome 7) in Magnaporthe oryzae revealed a non-random distribution of repeats, with repetitive sequences occurring in three clusters mostly in heterochromatic regions near the telomeres [21]. This pattern, however, did not hold true in the completely sequenced M. graminicola genome, where the actual distribution of repetitive sequences across the whole genome and all of the chromosomes could be ascertained. Repetitive sequences in the M. graminicola genome showed a non-random distribution across six chromosomes but were random for the remaining 15.

Distribution of repetitive sequences and of transposable elements in particular can be random or clustered depending upon the insertion preferences of the different elements. For example, LINEs and SINEs in rodents and humans show an insertion preference to distinct chromosomal domains, leading to differences in their distribution patterns in their host genomes [22,23]. Our results show that both random and non-random distributions occur in the M. graminicola genome.
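As an illustration of how the per-chromosome distribution calls above can be set up, the following is a minimal Python sketch, not the authors' code, of the binning step described in Methods: repetitive bases are counted in consecutive non-overlapping bins, and each bin is coded 1 if its repeat content is above the chromosome average and 0 otherwise. The function names, the 0-based end-exclusive interval convention, the requirement that intervals do not overlap and lie within the chromosome, the coding of a bin exactly at the average as 0, and the toy coordinates are all assumptions made for the example.

```python
def bin_repeat_content(chrom_length, repeat_intervals, bin_size=100_000):
    """Count repetitive bases in consecutive non-overlapping bins.

    repeat_intervals: iterable of (start, end) pairs, 0-based and
    end-exclusive, assumed not to overlap and to lie within the chromosome.
    The last bin may be shorter than bin_size (an assumption of this sketch).
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for start, end in repeat_intervals:
        pos = start
        while pos < end:
            b = pos // bin_size
            # Stop at the end of the current bin or of the interval.
            stop = min((b + 1) * bin_size, end)
            counts[b] += stop - pos
            pos = stop
    return counts


def code_against_mean(counts):
    """Code each bin as 1 (above the chromosome average) or 0 (otherwise)."""
    mean = sum(counts) / len(counts)
    return [1 if c > mean else 0 for c in counts]


# Toy chromosome of 500 bp with 100-bp bins (hypothetical coordinates).
counts = bin_repeat_content(500, [(10, 60), (90, 130), (400, 450)], bin_size=100)
coded = code_against_mean(counts)
```

The resulting 0/1 sequence is the kind of input that a runs test for randomness takes.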
The nested insertions primarily consisted of LTR-retrotransposons non-randomly clustered together in the gene-poor heterochromatic regions, as opposed to the smaller DNA transposons (MITEs) that were more prevalent in the gene-rich, euchromatic regions. Repeat-rich regions (AT-blocks), accounting for 36% of the genome and 5% of the gene space, are randomly distributed across the L. maculans supercontigs [24], whereas analyses of MITEs in the Epichloë genome revealed an insertion preference in the 5’ regions of the genes [25]. A dissimilar distribution pattern of repetitive elements suggests that different TEs have evolved distinct strategies to persist in the genome.

The set of eight dispensable chromosomes was statistically significantly enriched in transposable elements and tandem repeats as compared to the core chromosomes. Dispensable chromosomes, also known as B or supernumerary chromosomes, were enriched for transposable elements in other fungi, including Fusarium oxysporum [26] and Nectria haematococca [27]. However, unlike the M. graminicola genome, dispensable chromosomes in these other species were enriched for genes that played a major role in plant pathogenesis.

Figure 7 Dinucleotide bias for the Mycosphaerella graminicola repeat families identified by RECON. A specific dinucleotide bias (CpA↔TpA) was detected in the repeat families comprising Class I and Class II transposable elements. Ten families of MITEs and 14 of unclassified repeats showed no specific dinucleotide bias. The different dinucleotide biases that were evaluated are color coded: red, CpA↔TpA; blue, CpC↔TpC; green, CpG↔TpG; cyan, CpT↔TpT. RIP dominance, in a given alignment, is the ratio of a particular CpN↔TpN to the sum of the three other CpN↔TpN mutations. The arrows are shown bidirectionally to indicate that changes on both strands of DNA are analyzed (e.g., a C → T mutation on one strand will be seen as G → A on the complementary strand).

Types of repetitive sequences
Among all of the classes of repetitive elements, retrotransposons occupied the largest fraction in the M. graminicola genome. As retrotransposons follow a copy-paste mechanism for replication, these elements have usually been the major contributors to the repetitive fraction across a large number of genomes analyzed so far. The role of Class I retrotransposons in genome size inflation has been studied extensively in various organisms. In plants, a single family of LTR retrotransposon, BARE-1, is positively correlated to barley genome size increase [28], whereas genome size doubling of the wild rice Oryza australiensis was attributed to three LTR retrotransposon families that accounted for 60% of the genome [29]. Retrotransposons are also implicated in genome size expansion in fungi and oomycetes. The two Dothideomycetes relatives of M. graminicola with expanded genomes, Cladosporium fulvum [30] and M. fijiensis [31], had higher proportions of LTR retrotransposons in their genomes. The genome of the powdery mildew pathogen Blumeria graminis is four times larger than the average ascomycete genome; this difference can be attributed to non-LTR retrotransposons [32]. Similarly, in the oomycete P. infestans, two LTR retrotransposon families account for 29% of the 240-Mb genome [9].

The M. graminicola retrotransposon families showed various stages of decay. There were six LTR retrotransposon families in which LTRs or TSDs could not be identified. The remaining LTR retrotransposon families also had truncated elements. Fragmented sequences that lack any retrotransposon structural features but show similarity to partial retrotransposon sequences have been termed ‘remnants’ [33].
Such remnants with small or large deletions have been observed in other organisms. Various mechanisms, such as illegitimate recombination in Arabidopsis [34], Saccharomyces cerevisiae [35] and Drosophila [36] and unequal homologous recombination in plants [33], are responsible for deletions in repetitive elements and tend to counteract the repetitive element expansion in the genome. The extent of remnants in a family is correlated to the age of the family [33], with older families having a larger number of truncated elements.

Three putative helitron families carrying helicase domains were identified but only two families had the hallmark structural features associated with helitrons. Occurrence of truncated elements in the third family made the identification of structural features impossible. Helitrons have been detected in other ascomycete fungi, such as Aspergillus nidulans [37] and Dothistroma pini [30]. An expansion of helitrons was observed in the P. infestans genome, which had 10-fold higher copies than the related P. ramorum and P. sojae genomes [9]. In contrast to MITEs, TIRs and TSDs could not be identified for any of the DNA transposon families, which suggests that these transposon families may be old. Crypton, a class of DNA transposons found only in fungi [38], was also present in M. graminicola. These crypton sequences were initially identified by RECON [17] but were not included in the parsed set as they were below the family cut-off threshold of at least 10 elements.

Telomeres, RecQ helicase and putative Penelope-like elements
The RecQ helicase gene family, with extremely low GC content and tightly linked to the M. graminicola telomeres, has previously been identified in other fungal genomes including M. grisea [39], S. cerevisiae [40] and Ustilago maydis [41]. RecQ helicases are involved in a variety of functions including post-transcriptional gene silencing [42], maintaining genome stability [43] and DNA repair and recombination [44]. In S.
cerevisiae, RecQ helicase is a part of the telomere-localized Y’ element [45], which has been shown to play a structural role in protecting telomeres from accidental shortening or damage from recombination-mediated mechanisms. In S. cerevisiae strains lacking a gene for telomere replication, amplification and acquisition of Y’ elements in a large number of telomeres decreases the frequency of cell death [46]. Besides a structural role, yeast RecQ helicase expression is induced during meiosis, suggesting a role in meiosis or sexual development [40].

Penelope-like elements (PLE), characterized by a bacterial GIY-YIG endonuclease domain, have a widespread distribution across the tree of life including other fungi, rotifers, plants and heterokonts [47]. One family of telomere-localized PLEs was identified in the M. graminicola genome. These putative PLEs are a subtype that localize to the subtelomeric regions in a characteristic orientation with telomeric repeats at their 3’ ends [19]. Elements of this subtype lack the endonuclease domain but instead have species-specific telomeric repeats at their termini that can pair with the single-stranded telomere and utilize the 3’ hydroxyl group for priming sequence synthesis [19]. Thus, these telomeric PLEs may play a role in protecting telomeres in addition to the regular telomerase enzyme, which in M. graminicola is located on chromosome 7. Telomeres in Drosophila [48], silkworm [49] and Giardia lamblia [50] are also maintained by non-LTR elements, but they carry an endonuclease domain. The telomeres in these three species consist of long arrays of non-LTR elements, unlike the endonuclease-lacking PLEs which are low-copy repeats.

RIP in repetitive sequences
Because transposons have an inherent ability to amplify, defense mechanisms exist in the genome to minimize their numbers. One such mechanism that is specific to fungi is Repeat-Induced Point mutation (RIP).
RIP targets multiple-copy sequences during meiosis and introduces cytosine (C) to thymine (T) mutations [51]. Sexual reproduction in the field is a common phenomenon in M. graminicola and ascospores have been shown to play a major role in long-distance disease spread and initiation [52]. Therefore, it seems highly likely that repetitive sequences in M. graminicola can be targeted by RIP during meiosis. Although RIP was not verified experimentally, a detailed analysis for all the repeat families revealed a higher number of transition mutations, supporting the existence of a RIP-like mechanism in M. graminicola, as noted previously [12,13]. However, our results were based on a much larger and global dataset as compared to the previous reports.

Certain dinucleotides are preferentially targeted by RIP, known as RIP bias, which varies with the organism [53]. Repetitive sequences in M. graminicola exhibited a CpA dinucleotide bias, the same as seen in Neurospora crassa [53]. However, ten families of MITEs and 14 families of unclassified elements did not show any clear dinucleotide bias. Repeat elements in these families were shorter than the minimum length threshold detected by the RIP machinery. RIP typically acts on sequences that are longer than 400 bp in length [51] and are at least 80% identical [54]. Many families contained truncated elements and fewer full-length elements that might have prevented the detection of RIP bias in silico. Two completely identical M. graminicola rDNA repeat copies suggest that they were not detected by RIP. The presence of identical rDNA sequences is expected as these repeat clusters can evade RIP, although the exact mechanism is unknown [55].

The presence of transition mutations in the M. graminicola repetitive sequences generated many stop codons in the putative repeat-encoded protein domains. In N.
crassa, RIP increased the occurrence of termination codons TAG and TAA [56]. However, a few elements in at least two families (families 623 and 18) were identified with no stop codons, although the presence of transition mutations in other parts of the elements indicated that the RIP machinery targeted these sequences. Lack of stop codons in the reverse-transcriptase domain suggested that these elements might still be active in the genome. Evidence for activity is also available at the transcript level, with at least two repetitive families (families 623 and 242) represented in the M. graminicola EST dataset. Therefore, these elements might be functional and active in the genome until they can be identified and targeted by the RIP mechanism during sexual reproduction.

Nested clusters of transposable elements
In a number of eukaryotic genomes, TEs are nested or inserted into other TEs [57,58]. Although many classes of TEs target specific regions in the genome, no obvious insertion preference for elements was identified in M. graminicola. As LTR retrotransposons were the most abundant in the M. graminicola genome, the majority of nested clusters identified had an LTR retrotransposon as the base element. One of the mechanisms by which LTR retrotransposons can target specific regions in the genome has been analyzed in Saccharomyces cerevisiae [59]. In S. cerevisiae, the Ty5 LTR retrotransposon targets telomeric heterochromatin as a result of direct interaction between retrotransposon-encoded integrase protein and the silent information regulator 4 protein (Sir4), associated with heterochromatin [59]. Nested clusters also have been reported in other fungi such as L. maculans [24] and Epichloë species [60], but were absent from other expanded fungal genomes, such as B. graminis [32]. Nested clusters also have been mentioned in F. oxysporum [61], N. crassa [62] and M. grisea [63-65], but they are mainly in the putative centromeric regions. The centromeres of M.
graminicola have not been identified so whether they contain repeats is not known.

Due to their larger genome size and higher repetitive content, nesting of TEs has been studied extensively in plants. In maize, where 85% of the genome is repetitive [66] and ten retrotransposon families make up more than 25% of the genome, nested clusters may be a way to reduce the deleterious effects of TEs [57]. Genes were found in and near these nested TE clusters. Proliferation of a specific pathogenicity factor gene family unique to the powdery mildew pathogen B. graminis has been attributed to a family of LINE retrotransposons [67]. Such gene-repeat associations may hold an evolutionary and functional significance to the M. graminicola genome as it has been previously shown that in fungal and oomycete genomes, such repeat clusters act as ‘breeding grounds’, where new pathogenicity genes can be generated [9,24,30].

Conclusions
Repetitive sequence analysis revealed that at least 16.7% of the M. graminicola genome was comprised of repeats. Dispensable chromosomes had a significantly higher repeat content as compared to core chromosomes. Class I elements occupied the largest fraction of the repetitive sequences in the M. graminicola genome. One family of telomere-localized Penelope-like elements was identified in the M. graminicola genome. The distribution of repeats was non-random on six chromosomes. Repeats in M. graminicola often were arranged in nested clusters of 2–7 elements. Even though all the transposable elements were riddled with transition mutations, there were putative transcriptionally-active elements in the M. graminicola genome as inferred from the absence of stop codons in EST sequences corresponding to at least three repeat families in the EST dataset.

Methods

Identification of repeats
Three methods were used to identify the repetitive sequences in M. graminicola. Repeats were identified de novo using RECON [17] and RepeatScout [18], a k-mer based method.
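As a rough illustration of what a k-mer based approach measures, the Python sketch below counts how often each k-mer occurs in a sequence and keeps those seen at least a chosen number of times. It is not the RepeatScout or TALLYMER algorithm, which are considerably more elaborate; the function name, the toy sequence, and the values of k and min_count are assumptions chosen for the example, and only one strand is counted.

```python
from collections import Counter


def frequent_kmers(sequence, k, min_count):
    """Return {kmer: count} for k-mers occurring at least min_count times.

    Overlapping occurrences are counted; the reverse strand is ignored
    (a simplification of this sketch).
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Hypothetical 20-bp toy sequence containing a repeated ACGTACGT motif.
toy = "ACGTACGTTTGACGTACGTA"
repeats = frequent_kmers(toy, k=4, min_count=3)
```

In this toy case only the 4-mers that belong to the repeated motif pass the threshold; on a real genome the choice of k and of the count threshold would need to be tuned.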
The output from these two programs was used to generate custom repeat libraries, which were subsequently used to mask the M. graminicola genome using RepeatMasker [15]. Manually curated repeats specific to fungi were obtained from RepBase Update [16]. Repeats were also identified using another k-mer approach, TALLYMER [20], and frequencies of repetitive sequences were plotted along the chromosomes using Gnuplot [68].

Annotation of repeats
The repeat families identified by RECON [17] were annotated. The default output from RECON [17] was parsed to include families with 10 or more elements. The longest element from each family was compared against the NCBI non-redundant protein database using BLAST [69] to identify protein domains. These protein domains were used to classify repetitive sequences into different classes. Structural features such as Long Terminal Repeats (LTRs) and Terminal Inverted Repeats (TIRs) were verified in LTR retrotransposon and DNA transposon families using POLYDOT and EINVERTED, respectively, both from the EMBOSS [70] package. TIRs were also identified for families with no known protein sequences to identify Miniature Inverted Repeat Transposable Elements (MITEs). Sequences with no known proteins or other structural features were grouped into the unclassified category.

Insertion age estimation for LTR retrotransposons
Pair-wise alignment of intact LTRs from each LTR retrotransposon was used for estimating the age of insertion events. Numbers of substitutions were calculated for each pair and translated into divergence time using a substitution rate of 1.05 × 10^-9 nucleotides per site per year, as previously determined for fungi [71,72].

Distribution of repeats
A non-parametric runs test for randomness was used to determine whether the repetitive sequences on the chromosomes occur in a pattern or are random.
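A runs test of this kind can be sketched as follows. This is a minimal Python illustration using the textbook Wald-Wolfowitz normal approximation, not the Lawstat routine used for the analysis, so numerical details may differ. It assumes that a chromosome has already been reduced to a sequence of 0s and 1s (for example, bins below or above the chromosomal average repeat content); the function name and the toy sequences are hypothetical.

```python
import math


def runs_test(binary):
    """Wald-Wolfowitz runs test on a sequence of 0s and 1s.

    Returns (number of runs, z statistic, two-sided p-value) using the
    normal approximation. A strongly negative z means fewer runs than
    expected under randomness, i.e. clustering.
    """
    n1 = sum(1 for b in binary if b == 1)
    n0 = len(binary) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("both 0s and 1s are needed for a runs test")
    n = n1 + n0
    # A new run starts wherever two neighbouring values differ.
    runs = 1 + sum(1 for a, b in zip(binary, binary[1:]) if a != b)
    mean = 2.0 * n1 * n0 / n + 1.0
    var = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if var <= 0.0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p


# Toy examples: one block of repeat-rich bins versus strict alternation.
clustered = [1] * 10 + [0] * 10
alternating = [1, 0] * 10
runs_c, z_c, p_c = runs_test(clustered)
runs_a, z_a, p_a = runs_test(alternating)
```

Both toy sequences depart from randomness, but in opposite directions: the clustered one has too few runs and the alternating one too many, which is why the test is applied two-sided here.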
Each chromosome was divided into non-overlapping 100-kb bins and the extent of repetitive bases in each bin along with the average repetitive content/bin for the chromosome were calculated. The Lawstat package [73] in R [74] was used to do the analysis. The above steps were also repeated for non-overlapping, smaller 50-kb bins. A similar analysis for gene distribution was done using 50-kb bins.

Estimation of RIP
Elements in each family were aligned using ClustalX [75] and the alignments were manually curated using Jalview [76]. These alignments were used for estimating RIP dinucleotide bias using RIPcal [77]. For calculating percent identity, pairwise comparisons between elements of 15 repeat families were done. Elements with the highest GC content in each family were used as reference sequence. Sequence pairs with at least 99% sequence coverage were evaluated.

Nested elements
Relative chromosomal locations of the elements and their family annotations were used to identify clusters of nested elements. The base element was defined as the element into which all others were inserted. The alignment file for the base element family was used to determine if different parts of the base element made up a complete element. Target site duplication (TSD) of the base element in a given cluster was identified manually by examining the sequence in ARTEMIS [78].

Subtelomere organization
The distribution of different repetitive elements and proteins was analyzed over a 100-kb sequence from each end of the 21 chromosomes and viewed with OmniMapFree (http://www.omnimapfree.org). Repetitive family annotation was used to check for families exclusive to subtelomeric regions. For annotation of telomere-associated repeats and gene families, BLAST searches against the M. graminicola and NCBI ‘nr’ databases were done.
The non-LTR retrotransposon reverse transcriptase domain was used to classify subtelomeric non-LTR retrotransposons using a web-based tool, RTclass1 [79].

Tandem repeats
Tandem Repeat Finder [80] was used to do whole-genome analyses to identify tandem repeats. Both the unmasked and masked (RM ‘nolow’ option) M. graminicola sequences were used for this analysis. The results were parsed using Tandem Repeat Analysis Program [81].

Availability of supporting data
The data sets supporting the results of this article are available in the LabArchives, LLC, repository, DOI 10.6070/H4222RRG [unique persistent identifier and hyperlink to dataset(s) in http:// format will be submitted and provided if accepted].

Endnote
Names are necessary to report factually on available data. However, the USDA neither guarantees nor warrants the standard of the product, and the use of the name implies no approval of the product to the exclusion of others that also may be suitable.

Additional files
Additional file 1: Table S1. Repetitive families identified in Mycosphaerella graminicola by RECON. A summary of the general features of repetitive families identified in the finished genome sequence of M. graminicola by RECON. These features include family number, family name, element copy number, base pair coverage, annotation, families that were merged together, LTR/IR length, TSD length, presence/absence of RIP and RIP index value.
Additional file 2: Table S2. Repetitive families with fewer than 10 repeat elements per family identified in the Mycosphaerella graminicola genome. Repeat classification for 125 repetitive families, which had fewer than 10 elements per family in the M. graminicola genome.
Additional file 3: Table S3. Runs test for randomness of repetitive sequences across the Mycosphaerella graminicola chromosomes.
This table lists the consecutive runs of 0s (repetitive content in a non-overlapping 50-kb bin less than the chromosomal average) and 1s (repetitive content of the 50-kb bin is above the chromosomal average).

Competing interests
The authors declare that they have no competing interests.

Authors’ contributions
BD and SBG designed the analysis. BD performed the analysis and wrote the initial draft. BD, NG, RH and SBG edited and finalized the manuscript. All authors read and approved the final manuscript.

Acknowledgements
DNA sequencing of M. graminicola was performed at the U. S. Department of Energy’s Joint Genome Institute through the Community Sequencing Program (www.jgi.doe.gov/csp/) and all sequence data are publicly available. Supported by USDA CRIS project 3602-22000-015-00D. BD and RCH were supported by Genome Canada Large Scale Applied Research Project 164DIA.

Author details
1 Department of Forest and Conservation Sciences, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada. 2 Department of Botany, Beaty Biodiversity Centre, 2212 Main Mall, Vancouver, BC V6T 1Z4, Canada. 3 Natural Resources Canada, Laurentian Forestry Centre, 1055 du PEPS, Stn. Sainte-Foy, P.O. Box 10380, Quebec, QC G1V 4C7, Canada. 4 USDA-ARS, Crop Production and Pest Control Research Unit, Purdue University, 915 W. State Street, West Lafayette, Indiana 47907-2054, USA.

Received: 1 August 2014 Accepted: 12 December 2014 Published: 17 December 2014

References
1. Palmer C-L, Skinner W: Mycosphaerella graminicola: latent infection, crop devastation and genomics. Mol Plant Pathol 2002, 3(2):63–70.
2. Goodwin S: Back to basics and beyond: increasing the level of resistance to Septoria tritici blotch in wheat. Australas Plant Pathol 2007, 36(6):532–538.
3. Fisher N, Griffin M: Benzimidazole (MBC) resistance in Septoria tritici. ISPP Chem Control Newsl 1984, 5:8–9.
4. Fraaije BA, Burnett FJ, Clark WS, Motteram J, Lucas JA: Resistance development to QoI inhibitors in populations of Mycosphaerella graminicola in the UK. In Modern fungicides and antifungal compounds IV. Edited by Dehne HW, Gisi U, Kuck KH, Russell PE, Alton LH. UK: British Crop Protection Council; 2005:63–71.
5. Mycosphaerella graminicola genome portal. http://genome.jgi-psf.org/Mycgr3/Mycgr3.home.html.
6. Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, Flavell A, Leroy P, Morgante M, Panaud O, Paux E, SanMiguel P, Schulman AH: A unified classification system for eukaryotic transposable elements. Nat Rev Genet 2007, 8(12):973–982.
7. McClintock B: The origin and behavior of mutable loci in maize. Proc Natl Acad Sci 1950, 36(6):344–355.
8. Kazazian HH: Mobile elements: drivers of genome evolution. Science 2004, 303(5664):1626–1632.
9. Haas BJ, Kamoun S, Zody MC, Jiang RHY, Handsaker RE, Cano LM, Grabherr M, Kodira CD, Raffaele S, Torto-Alalibo T, Bozkurt TO, Ah-Fong AMV, Alvarado L, Anderson VL, Armstrong MR, Avrova A, Baxter L, Beynon J, Boevink PC, Bollmann SR, Bos JIB, Bulone V, Cai G, Cakir C, Carrington JC, Chawner M, Conti L, Costanzo S, Ewan R, Fahlgren N et al: Genome sequence and analysis of the Irish potato famine pathogen Phytophthora infestans. Nature 2009, 461(7262):393–398.
10. Jiang N, Bao Z, Zhang X, Eddy SR, Wessler SR: Pack-MULE transposable elements mediate gene evolution in plants. Nature 2004, 431(7008):569–573.
11. Xiao H, Jiang N, Schaffner E, Stockinger EJ, van der Knaap E: A retrotransposon-mediated gene duplication underlies morphological variation of tomato fruit. Science 2008, 319(5869):1527–1530.
12. Dhillon B, Cavaletto JR, Wood KV, Goodwin SB: Accidental amplification and inactivation of a methyltransferase gene eliminates cytosine methylation in Mycosphaerella graminicola. Genetics 2010, 186(1):67–77.
13. Goodwin SB, Ben M’Barek S, Dhillon B, Wittenberg AHJ, Crane CF, Hane JK, Foster AJ, Van der Lee TAJ, Grimwood J, Aerts A, Antoniw J, Bailey A, Bluhm B, Bowler J, Bristow J, van der Burgt A, Canto-Canché B, Churchill ACL, Conde-Ferràez L, Cools HJ, Coutinho PM, Csukai M, Dehal P, De Wit P, Donzelli B, van de Geest HC, van Ham RCHJ, Hammond-Kosack KE, Henrissat B, Kilian A et al: Finished genome of the fungal wheat pathogen Mycosphaerella graminicola reveals dispensome structure, chromosome plasticity, and stealth pathogenesis. PLoS Genet 2011, 7(6):e1002070.
14. Wittenberg AHJ, van der Lee TAJ, Ben M’Barek S, Ware SB, Goodwin SB, Kilian A, Visser RGF, Kema GHJ, Schouten HJ: Meiosis drives extraordinary genome plasticity in the haploid fungal plant pathogen Mycosphaerella graminicola. PLoS ONE 2009, 4(6):e5863.
15. Smit AFA, Hubley R, Green P: RepeatMasker Open3.0. 1996-2004 [http://repeatmasker.org]
16. Jurka J: Repbase update: a database and an electronic journal of repetitive elements. Trends Genet 2000, 16(9):418–420.
17. Bao Z, Eddy SR: Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res 2002, 12(8):1269–1276.
18. Price AL, Jones NC, Pevzner PA: De novo identification of repeat families in large genomes. Bioinformatics 2005, 21(Suppl 1):i351–i358.
19. Gladyshev EA, Arkhipova IR: Telomere-associated endonuclease-deficient Penelope-like retroelements in diverse eukaryotes. Proc Natl Acad Sci 2007, 104(22):9352–9357.
20. Kurtz S, Narechania A, Stein JC, Ware D: A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics 2008, 9:517.
21. Thon M, Pan H, Diener S, Papalas J, Taro A, Mitchell T, Dean R: The role of transposable element clusters in genome evolution and loss of synteny in the rice blast fungus Magnaporthe oryzae. Genome Biol 2006, 7(2):R16.
22. Chen T, Manuelidis L: SINEs and LINEs cluster in distinct DNA fragments of Giemsa band size. Chromosoma 1989, 98(5):309–316.
23. Acosta MJ, Marchal JA, Fernandez-Espartero CH, Bullejos M, Sanchez A: Retroelements (LINEs and SINEs) in vole genomes: Differential distribution in the constitutive heterochromatin. Chromosom Res 2008, 16(7):949–959.
24. Rouxel T, Grandaubert J, Hane JK, Hoede C, van de Wouw AP, Couloux A, Dominguez V, Anthouard V, Bally P, Bourras S, Cozijnsen AJ, Ciuffetti LM, Degrave A, Dilmaghani A, Duret L, Fudal I, Goodwin SB, Gout L, Glaser N, Linglin J, Kema GHJ, Lapalu N, Lawrence CB, May K, Meyer M, Ollivier B, Poulain J, Schoch CL, Simon A, Spatafora JW, et al: Effector diversification within compartments of the Leptosphaeria maculans genome affected by Repeat-Induced Point mutations. Nat Commun 2011, 2:202.
25. Fleetwood DJ, Khan AK, Johnson RD, Young CA, Mittal S, Wrenn RE, Hesse U, Foster SJ, Schardl CL, Scott B: Abundant degenerate miniature inverted-repeat transposable elements in genomes of Epichloid fungal endophytes of grasses. Genome Biol Evol 2011, 3:1253–1264.
26. Ma L-J, van der Does HC, Borkovich KA, Coleman JJ, Daboussi M-J, Di Pietro A, Dufresne M, Freitag M, Grabherr M, Henrissat B, Houterman PM, Kang S, Shim W-B, Woloshuk C, Xie X, Xu J-R, Antoniw J, Baker SE, Bluhm BH, Breakspear A, Brown DW, Butchko RAE, Chapman S, Coulson R, Coutinho PM, Danchin EGJ, Diener A, Gale LR, Gardiner DM, Goff S, et al: Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium. Nature 2010, 464(7287):367–373.
27. Coleman JJ, Rounsley SD, Rodriguez-Carres M, Kuo A, Wasmann CC, Grimwood J, Schmutz J, Taga M, White GJ, Zhou S, Schwartz DC, Freitag M, Ma L-J, Danchin EGJ, Henrissat B, Coutinho PM, Nelson DR, Straney D, Napoli CA, Barker BM, Gribskov M, Rep M, Kroken S, Molnár I, Rensing C, Kennell JC, Zamora J, Farman ML, Selker EU, Salamov A, et al: The Genome of Nectria haematococca: Contribution of Supernumerary Chromosomes to Gene Expansion. PLoS Genet 2009, 5(8):e1000618.
28. Kalendar R, Tanskanen J, Immonen S, Nevo E, Schulman AH: Genome evolution of wild barley (Hordeum spontaneum) by BARE-1 retrotransposon dynamics in response to sharp microclimatic divergence. Proc Natl Acad Sci U S A 2000, 97(12):6603–6607.
29. Piegu B, Guyot R, Picault N, Roulin A, Saniyal A, Kim H, Collura K, Brar DS, Jackson S, Wing RA, Panaud O: Doubling genome size without polyploidization: dynamics of retrotransposition-driven genomic expansions in Oryza australiensis, a wild relative of rice. Genome Res 2006, 16(10):1262–1269.
30. de Wit PJGM, van der Burgt A, Okmen B, Stergiopoulos I, Abd-Elsalam KA, Aerts AL, Bahkali AH, Beenen HG, Chettri P, Cox MP, Datema E, de Vries RP, Dhillon B, Ganley AR, Griffiths SA, Guo Y, Hamelin RC, Henrissat B, Kabir MS, Jashni MK, Kema G, Klaubauf S, Lapidus A, Levasseur A, Lindquist E, Mehrabi R, Ohm RA, Owen TJ, Salamov A, Schwelm A, et al: The genomes of the fungal plant pathogens Cladosporium fulvum and Dothistroma septosporum reveal adaptation to different hosts and lifestyles but also signatures of common ancestry. PLoS Genet 2012, 8(11):e1003088.
31. Ohm RA, Feau N, Henrissat B, Schoch CL, Horwitz BA, Barry KW, Condon BJ, Copeland AC, Dhillon B, Glaser F, Hesse CN, Kosti I, LaButti K, Lindquist EA, Lucas S, Salamov AA, Bradshaw RE, Ciuffetti L, Hamelin RC, Kema GHJ, Lawrence C, Scott JA, Spatafora JW, Turgeon BG, de Wit PJGM, Zhong S, Goodwin SB, Grigoriev IV: Diverse lifestyles and strategies of plant pathogenesis encoded in the genomes of eighteen Dothideomycetes fungi. PLoS Pathog 2012, 8(12):e1003037.
32. Spanu PD, Abbott JC, Amselem J, Burgis TA, Soanes DM, Stuber K, van Themaat EVL, Brown JKM, Butcher SA, Gurr SJ, Lebrun M-H, Ridout CJ, Schulze-Lefert P, Talbot NJ, Ahmadinejad N, Ametz C, Barton GR, Benjdia M, Bidzinski P, Bindschedler LV, Both M, Brewer MT, Cadle-Davidson L, Cadle-Davidson MM, Collemare J, Cramer R, Frenkel O, Godfrey D, Harriman J, Hoede C, et al: Genome expansion and gene loss in powdery mildew fungi reveal tradeoffs in extreme parasitism. Science 2010, 330(6010):1543–1546.
33. Ma J, Devos KM, Bennetzen JL: Analyses of LTR-retrotransposon structures reveal recent and rapid genomic DNA loss in rice. Genome Res 2004, 14(5):860–869.
34. Devos KM, Brown JK, Bennetzen JL: Genome size reduction through illegitimate recombination counteracts genome expansion in Arabidopsis. Genome Res 2002, 12(7):1075–1079.
35. Asami Y, Jia DW, Tatebayashi K, Yamagata K, Tanokura M, Ikeda H: Effect of the DNA topoisomerase II inhibitor VP-16 on illegitimate recombination in yeast chromosomes. Gene 2002, 291(1–2):251–257.
36. Petrov DA, Lozovskaya ER, Hartl DL: High intrinsic rate of DNA loss in Drosophila. Nature 1996, 384(6607):346–349.
37. Galagan JE, Calvo SE, Cuomo C, Ma L-J, Wortman JR, Batzoglou S, Lee S-I, Basturkmen M, Spevak CC, Clutterbuck J, Kapitonov V, Jurka J, Scazzocchio C, Farman M, Butler J, Purcell S, Harris S, Braus GH, Draht O, Busch S, D'Enfert C, Bouchier C, Goldman GH, Bell-Pedersen D, Griffiths-Jones S, Doonan JH, Yu J, Vienken K, Pain A, Freitag M, et al: Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature 2005, 438(7071):1105–1115.
38. Goodwin TJ, Butler MI, Poulter RT: Cryptons: a group of tyrosine-recombinase-encoding DNA transposons from pathogenic fungi. Microbiology 2003, 149(Pt 11):3099–3109.
39. Gao W, Khang CH, Park SY, Lee YH, Kang S: Evolution and organization of a highly dynamic, subtelomeric helicase gene family in the rice blast fungus Magnaporthe grisea. Genetics 2002, 162(1):103–112.
40.
Louis EJ: The chromosome ends of Saccharomyces cerevisiae. Yeast 1995,11(16):1553–1573.41. Sanchez-Alonso P, Guzman P: Organization of chromosome ends inUstilago maydis. RecQ-like helicase motifs at telomeric regions. Genetics1998, 148(3):1043–1054.42. Cogoni C, Macino G: Posttranscriptional gene silencing in Neurospora bya RecQ DNA helicase. Science 1999, 286(5448):2342–2344.43. Chakraverty RK, Hickson ID: Defending genome integrity during DNAreplication: a proposed role for RecQ family helicases. Bioessays 1999,21(4):286–294.44. Shen JC, Loeb LA: The Werner syndrome gene: the molecular basis ofRecQ helicase-deficiency diseases. Trends Genet 2000, 16(5):213–220.45. Louis EJ, Naumova ES, Lee A, Naumov G, Haber JE: The chromosome endin yeast: its mosaic nature and influence on recombinational dynamics.Genetics 1994, 136(3):789–802.46. Lundblad V, Blackburn EH: An alternative pathway for yeast telomeremaintenance rescues est1- senescence. Cell 1993, 73(2):347–360.47. Arkhipova IR: Distribution and phylogeny of Penelope-like elements ineukaryotes. Syst Biol 2006, 55(6):875–885.48. George JA, DeBaryshe PG, Traverse KL, Celniker SE, Pardue M-L: Genomicorganization of the Drosophila telomere retrotransposable elements.Genome Res 2006, 16(10):1231–1240.49. Fujiwara H, Osanai M, Matsumoto T, Kojima K: Telomere-specific non-LTRretrotransposons and telomere maintenance in the silkworm, Bombyxmori. Chromosome Res 2005, 13(5):455–467.50. Arkhipova IR, Morrison HG: Three retrotransposon families in the genomeof Giardia lamblia: Two telomeric, one dead. Proc Natl Acad Sci 2001,98(25):14497–14502.51. Watters MK, Randall TA, Margolin BS, Selker EU, Stadler DR: Action ofrepeat-induced point mutation on both strands of a duplex and ontandem duplications of various sizes in Neurospora. Genetics 1999,153(2):705–714.52. Garcia C, Marshall D: Observations on the ascogenous stage of Septoriatritici in Texas. Mycol Res 1992, 96:65–70.53. 
Cambareri EB, Jensen BC, Schabtach E, Selker EU: Repeat-induced G-C toA-T mutations in Neurospora. Science 1989, 244(4912):1571–1575.54. Cambareri EB, Singer MJ, Selker EU: Recurrence of repeat-induced pointmutation (RIP) in Neurospora crassa. Genetics 1991, 127(4):699–710.55. Galagan JE, Selker EU: RIP: the evolutionary cost of genome defense.Trends Genet 2004, 20(9):417–423.56. Singer MJ, Marcotte BA, Selker EU: DNA methylation associated withrepeat-induced point mutation in Neurospora crassa. Mol Cell Biol 1995,15(10):5586–5597.57. SanMiguel P, Tikhonov A, Jin YK, Motchoulskaia N, Zakharov D, Melake-Berhan A,Springer PS, Edwards KJ, Lee M, Avramova Z, Bennetzen JL: Nestedretrotransposons in the intergenic regions of the maize genome. Science1996, 274(5288):765–768.58. Baucom RS, Estill JC, Chaparro C, Upshaw N, Jogi A, Deragon J-M, Westerman RP,SanMiguel PJ, Bennetzen JL: Exceptional diversity, non-random distribution,and rapid evolution of retroelements in the B73 maize genome. PLoS Genet2009, 5(11):e1000732.59. Xie W, Gai X, Zhu Y, Zappulla DC, Sternglanz R, Voytas DF: Targeting of theyeast Ty5 retrotransposon to silent chromatin is mediated by interactionsbetween integrase and Sir4p. Mol Cell Biol 2001, 21(19):6606–6614.60. Schardl CL, Young CA, Hesse U, Amyotte SG, Andreeva K, Calie PJ,Fleetwood DJ, Haws DC, Moore N, Oeser B, Panaccione DG, Schweri KK,Voisey CR, Farman ML, Jaromczyk JW, Roe BA, O'Sullivan DM, Scott B,Tudzynski P, An Z, Arnaoudova EG, Bullock CT, Charlton ND, Chen L, Cox M,Dinkins RD, Florea S, Glenn AE, Gordon A, Güldener U, et al: Plant-symbioticfungi as chemical engineers: multi-genome analysis of the Clavicipitaceaereveals dynamics of alkaloid loci. PLoS Genet 2013, 9(2):e1003323.61. Hua-Van A, Daviere JM, Kaper F, Langin T, Daboussi MJ: Genome organizationin Fusarium oxysporum: clusters of class II transposons. Curr Genet 2000,37(5):339–347.62. 
Cambareri EB, Aisner R, Carbon J: Structure of the chromosome VIIcentromere region in Neurospora crassa: degenerate transposons andsimple repeats. Mol Cell Biol 1998, 18(9):5465–5477.63. Valent B, Chumley FG: Avirulence genes and mechanisms of geneticinstability in the rice blast fungus. In The Rice Blast Disease. Edited by ZeiglerRS, Leong SA, Teng PS. Cambridge, UK: Oxford University Press; 1994:111–134.64. Nitta N, Farman ML, Leong SA: Genome organization of Magnaporthegrisea: integration of genetic maps, clustering of transposable elementsand identification of genome duplications and rearrangements.TAG Theor Appl Genet 1997, 95(1):20–32.65. Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ, Thon M,Kulkarni R, Xu JR, Pan H, Read ND, Lee YH, Carbone I, Brown D, Oh YY,Donofrio N, Jeong JS, Soanes DM, Djonovic S, Kolomiets E, Rehmeyer C, Li W,Harding M, Kim S, Lebrun MH, Bohnert H, Coughlan S, Butler J, Calvo S, Ma L-J,et al: The genome sequence of the rice blast fungus Magnaporthe grisea.Nature 2005, 434(7036):980–986.66. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J,Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C,Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K,Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B,et al: The B73 maize genome: complexity, diversity, and dynamics. Science2009, 326(5956):1112–1115.67. Sacristan S, Vigouroux M, Pedersen C, Skamnioti P, Thordal-Christensen H,Micali C, Brown JKM, Ridout CJ: Coevolution between a family of parasitevirulence effectors and a class of LINE-1 retrotransposons. PLoS ONE2009, 4(10):e7463.68. Williams T, Kelley C: Gnuplot 4.4: an interactive plotting program. 2010[http://gnuplot.sourceforge.net/]69. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignmentsearch tool. J Mol Biol 1990, 215(3):403–410.70. 
Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular BiologyOpen Software Suite. Trends Genet 2000, 16(6):276–277.71. Berbee ML, Taylor JW: Dating the molecular clock in fungi - how close arewe? Fungal Biol Rev 2010, 24(1–2):1–16.72. Kasuga T, White TJ, Taylor JW: Estimation of nucleotide substitution ratesin Eurotiomycete fungi. Mol Biol Evol 2002, 19(12):2318–2324.73. Noguchi K, Hui WLW, Gel YR, Gastwirth JL, Miao W: lawstat: An R packagefor biostatistics, public policy, and law. 2009 [http://CRAN.R-project.org/package=lawstat]74. R Development Core Team: R: A language and environment for statisticalcomputing. 2011 [http://www.R-project.org]75. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliamH, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ,Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics 2007,23(21):2947–2948.76. Clamp M, Cuff J, Searle SM, Barton GJ: The Jalview Java alignment editor.Bioinformatics 2004, 20(3):426–427.77. Hane JK, Oliver RP: RIPCAL: a tool for alignment-based analysis of repeat-induced point mutations in fungal genomic sequences. BMC Bioinformatics2008, 9:478.78. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA, Barrell B:Artemis: sequence visualization and annotation. Bioinformatics 2000,Submit your next manuscript to BioMed Centraland take full advantage of: • Convenient online submission• Thorough peer review• No space constraints or color figure charges• Immediate publication on acceptance• Inclusion in PubMed, CAS, Scopus and Google Scholar• Research which is freely available for redistributionDhillon et al. BMC Genomics 2014, 15:1132 Page 17 of 17http://www.biomedcentral.com/1471-2164/15/113216(10):944–945.79. Kapitonov VV, Tempel S, Jurka J: Simple and fast classification of non-LTRretrotransposons based on phylogeny of their RT domain proteinsequences. Gene 2009, 448(2):207–213.80. 
Benson G: Tandem repeats finder: a program to analyze DNA sequences.Nucleic Acids Res 1999, 27(2):573–580.81. Sobreira TJ, Durham AM, Gruber A: TRAP: automated classification,quantification and annotation of tandemly repeated sequences.Bioinformatics 2006, 22(3):361–362.doi:10.1186/1471-2164-15-1132Cite this article as: Dhillon et al.: The landscape of transposableelements in the finished genome of the fungal wheat pathogenMycosphaerella graminicola. BMC Genomics 2014 15:1132.Submit your manuscript at www.biomedcentral.com/submit

