UBC Faculty Research and Publications

Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest
Keeling, Christopher I; Yuen, Macaire M; Liao, Nancy Y; Docking, T Roderick; Chan, Simon K; Taylor, Greg A; Palmquist, Diana L; Jackman, Shaun D; Nguyen, Anh; Li, Maria; Henderson, Hannah; Janes, Jasmine K; Zhao, Yongjun; Pandoh, Pawan; Moore, Richard; Sperling, Felix A; Huber, Dezene P W; Birol, Inanc; Jones, Steven J; Bohlmann, Joerg
Mar 27, 2013

Keeling et al. Genome Biology 2013, 14:R27
http://genomebiology.com/content/14/3/R27 (27 March 2013)

RESEARCH Open Access

Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest

Christopher I Keeling1, Macaire MS Yuen1, Nancy Y Liao2, T Roderick Docking2, Simon K Chan2, Greg A Taylor2, Diana L Palmquist2, Shaun D Jackman2, Anh Nguyen1, Maria Li1, Hannah Henderson1, Jasmine K Janes3, Yongjun Zhao2, Pawan Pandoh2, Richard Moore2, Felix AH Sperling3, Dezene P W Huber4, Inanc Birol2,5, Steven JM Jones2,5,6 and Joerg Bohlmann1*

Abstract

Background: The mountain pine beetle, Dendroctonus ponderosae Hopkins, is the most serious insect pest of western North American pine forests. A recent outbreak destroyed more than 15 million hectares of pine forests, with major environmental effects on forest health, and economic effects on the forest industry. The outbreak has in part been driven by climate change, and will contribute to increased carbon emissions through decaying forests.

Results: We developed a genome sequence resource for the mountain pine beetle to better understand the unique aspects of this insect's biology. A draft de novo genome sequence was assembled from paired-end, short-read sequences from an individual field-collected male pupa, and scaffolded using mate-paired, short-read genomic sequences from pooled field-collected pupae, paired-end short-insert whole-transcriptome shotgun sequencing reads of mRNA from adult beetle tissues, and paired-end Sanger EST sequences from various life stages. We describe the cytochrome P450, glutathione S-transferase, and plant cell wall-degrading enzyme gene families important to the survival of the mountain pine beetle in its harsh and nutrient-poor host environment, and examine genome-wide single-nucleotide polymorphism variation.
A horizontally transferred bacterial sucrose-6-phosphate hydrolase was evident in the genome, and its tissue-specific transcription suggests a functional role for this beetle.

Conclusions: Despite Coleoptera being the largest insect order with over 400,000 described species, including many agricultural and forest pest species, this is only the second genome sequence reported in Coleoptera, and will provide an important resource for the Curculionoidea and other insects.

Keywords: Coleoptera, Curculionoidea, Scolytinae, bark beetles, conifer, cytochrome P450, glutathione S-transferase, plant cell wall-degrading enzymes, horizontal gene transfer, sex chromosomes

* Correspondence: bohlmann@msl.ubc.ca
1Michael Smith Laboratories, University of British Columbia, 301-2185 East Mall, Vancouver, BC, Canada V6T 1A4
Full list of author information is available at the end of the article

© 2013 Keeling et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The order Coleoptera (beetles) is the most species-rich order of insects, with over 400,000 described species [1], yet to date only one coleopteran genome sequence has been published, that of the red flour beetle (Tribolium castaneum, superfamily Tenebrionoidea), a pest of stored grain products [2]. The superfamily Curculionoidea (weevils) diverged from Tenebrionoidea 236 million years ago (Mya) [3], and contains over 60,000 described species, including many of the world's insect pest species. One such group of pest species are the bark beetles (subfamily Scolytinae), encompassing over 6,000 species in approximately 220 genera. Currently, one of the most destructive bark beetle species is the mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins. MPB has had a long recorded history of major outbreaks in western North America, and attacks many pine species (Pinus spp.) [4]. The current MPB epidemic far exceeds the scope of any previously recorded bark beetle outbreak, with over 15 million hectares of pine forests, predominantly lodgepole pine (Pinus contorta), infested in British Columbia alone [4]. In recent years, MPB has been found east of the northern Canadian Rockies, which was previously thought to be an effective geographical barrier [5]. This range expansion includes infestation of pine species not previously encountered by MPB, particularly jack pine (Pinus banksiana) and its hybrid with lodgepole pine [6]. As the predominantly jack pine boreal forest extends to the Atlantic coast, the potential for this beetle to spread further eastward is of major ecological, environmental, and economic concern [4,7].

MPB is one of 16 species of Dendroctonus described in the New World, with habitats ranging from Nicaragua to arctic North America [8], and additional single species in both northern Europe and Asia. As members of the tribe Tomicini in the Scolytinae, Dendroctonus have ancient associations with conifers [9]. Most Dendroctonus species are capable of killing their conifer hosts, an ancestral ability of this genus [10], and several species are considered serious pests. MPB is present over a wide latitude range, has thus adapted to wide temperature ranges, and has a measurable spatial genetic structure [11].
The ability of MPB and Dendroctonus in general to inhabit a wide range of latitudes may foretell future range expansion with anticipated climate change [4,12], and may substantially affect future carbon cycles as beetle-killed trees decay or burn to release their stored carbon [7,13].

The success of MPB and many other bark beetle species in overcoming the defenses of their conifer host [14,15] and in colonizing the trees is due in part to their pheromone-mediated mass attack on individual trees [16]. Both male and female beetles produce aggregation pheromones that effectively initiate and modulate the mass attack. These compounds are identical or similar across most Dendroctonus species and some other Scolytinae [17]. Another factor in their success in killing trees originates from the symbiotic fungi that the beetles vector to new host trees. These fungi, such as the pathogenic blue stain ascomycete, Grosmannia clavigera (for which extensive genomic resources are available [18-21]), infiltrate the sapwood of the tree and effectively block water transport. Both the beetles and associated fungi probably contribute to detoxification and metabolism of defense chemicals in the host pine.

Although transcriptomic resources are available for Scolytinae [22-24], the genome sequence of MPB will provide a valuable new reference for further studies in this and related Coleoptera. A challenge with assembling the genome of non-model organisms such as MPB is the difficulty or impossibility of obtaining highly inbred individuals to reduce the heterozygosity. As the cost of sequencing continues to drop, genome sequencing will be feasible for many more organisms, many of which are not practical or amenable to extensive inbreeding.
Thus, the assembly processes must be adapted to resolve the issues of greater heterozygosity for assembly of diploid genomes into haploid, or alternatively diploid, assemblies.

We report the draft de novo genome sequence of MPB, and describe several highlights including the presence of a horizontally transferred gene, the identification of genome sequence representing the sex chromosomes, genome-wide single-nucleotide polymorphism (SNP) variation and distribution, and gene families with roles in host colonization.

Results and discussion

Assembly

Owing to the univoltine lifecycle of MPB and the difficulty in rearing it through many generations in the laboratory, we chose to sequence wild-collected insects. The genome sequence was assembled from over 400× coverage of short-read paired end tag (PET) sequencing of genomic DNA from an individual, field-collected, male MPB pupa, and was scaffolded with over 300× coverage of short-read mate-paired end tag (MPET) sequencing of 6,600, 10,000, and 12,000 bp fragment sizes with genomic DNA from a pool of mixed-sex pupae (Table 1). In addition, scaffolds were merged when supported by scaffold-spanning Sanger expressed sequence tag (EST) and/or RNA sequencing (RNA-seq) data from various life stages and tissues [24].

Examination of the assembly identified a fraction of sequences from a gammaproteobacterium most similar to Acinetobacter. The microbial community associated with Dendroctonus spp. is known to be diverse, and Acinetobacter spp. have previously been shown to associate with bark beetles [25-27]. However, we could not ascertain whether the bacterial sequences originated from a symbiotic association with the original MPB pupa sequenced, as Acinetobacter sequences were absent from sequences originating from the MPB pupae used for scaffolding and the female adult assembly, suggesting an environmental association rather than a symbiotic relationship.
A total of 264 scaffolds (2.4 Mbp) that only contained gene models with best matches to Acinetobacter genes were removed from the male MPB assembly after comparing the assembly with the complete genome of Acinetobacter lwoffii [28] using BLASTn and BLASTx.

After removal of Acinetobacter sequences, the assembly resulted in 8,460 scaffolds greater than 1,000 bp in size, and an N50 of 580,960 bp (Table 2). The reconstructed genome size of 204 Mbp was comparable with the estimated genome size of 208 Mbp for MPB (Gregory et al., submitted paper), a value similar to the 204 Mbp for T. castaneum [2]. The G+C content of the MPB assembly was similar to that of T. castaneum (36% versus 33% G+C, respectively) [2]. We identified 13,088 gene models, of which 92% were supported by significant protein homology with the National Center for Biotechnology Information (NCBI) nr database. We found that 96.4% of the ultra-conserved core eukaryotic genes [29] in the male assembly were complete, and 3.2% were partial (Table 2).

Sex chromosomes and shared synteny with the T. castaneum genome

Species in the genus Dendroctonus have a wide range of male meiotic karyotypic formulae, from 5 AA + Xyp in Dendroctonus mexicanus to 14 AA + Xyp in Dendroctonus rufipennis [30-33]. Identifying segments of the genome assembly representing the sex chromosomes in MPB has important implications for investigating sex-specific differences in MPB, such as pheromone production and size dimorphism, and how the sex-specific differences in this and other Dendroctonus species originate. In the case of MPB and its sibling species, Dendroctonus jeffreyi, the karyotype is 11 AA + neo-XY [31].
This arrangement (see Additional file 1, Figure S1) is thought to have originated from an ancestral state of 12 AA + Xyp by a fusion of the X chromosome with the largest autosome to become neo-X, followed by a loss of the ancestral yp, resulting in the homozygous daughter chromosome becoming neo-Y [32]. Therefore, the largest chromosomes in MPB are the neo-X and neo-Y sex chromosomes [31]. Given that we sequenced a male pupa, and males are the heterogametic sex, but physical or linkage maps are currently lacking, we attempted to identify scaffolds in the assembly that may have originated from the sex chromosomes. We hypothesized that the single-nucleotide variant (SNV) density should be very low for scaffolds that originate from the ancestral X portion of neo-X (as there would only be one copy in a male), or for regions of the neo-X and neo-Y originating from the ancestral autosome, which have sufficiently diverged to occur in the assembly as separate scaffolds (as there would be one copy of each in the male). To quantify this, we mapped the genomic DNA reads from the individual male back onto the assembly and measured the SNV density for each scaffold. We found that a few large scaffolds (Seq_1101913, Seq_1101939, Seq_1102308, Seq_1102689, Seq_1102713, and Seq_1102823) had very low SNV densities (as low as 0.006 SNVs/kbp, for scaffolds greater than 1 Mbp) (Figure 1), whereas the overall SNV density of the assembly was 0.48 SNVs/kbp. There were smaller scaffolds up to 84 kbp with 0 SNVs/kbp, which may be additional unique pieces of the sex chromosomes.

Although the common ancestor of MPB and T. 
castaneum diverged 236 Mya [3], we hypothesized that there may be conserved shared synteny between the two species, particularly in the X sex chromosome, even though male T. castaneum have a 9 AA+Xyp karyotype.

Table 1 Data used in assemblies and scaffolding

Sequence type           Number of reads, millions  Read length, bp  Total coverage, Gbp  Fragment length, bp  Sequence coverage
Male PET                387                        76-114           41                   630                  200×
Male PET                480                        100-114          50                   590                  243×
Female PET              328                        150              49                   425                  227×
MPET                    610                        76-101           59                   6600                 290×
MPET                    72                         51               3.7                  10000                18×
MPET                    90                         51               4.6                  12000                22×
RNA-seq                 257                        51-76            13                   170                  NA
Paired end Sanger ESTs  0.18                       750              0.135                1100                 NA

EST, expressed sequence tag; MPET, mate-paired end tag; NA, not applicable; seq, sequencing.

Table 2 Assembly statistics

Statistic                                                   Male      Female
Number of contigs                                           59583     40744
N50 contigs, bp                                             7451      10101
Largest contig, bp                                          225798    158947
Number of scaffolds >1,000 bp                               8460      6547
N50 scaffolds, bp (a)                                       597806    382123
Number of scaffolds >N50                                    76        136
Largest scaffold, bp (a)                                    3746698   6768731
Reconstruction, Mbp (a)                                     202       213
Gene models                                                 13088     12873
Ultra-conserved core eukaryotic genes, complete/partial, %  96.4/3.2  98.4/1.6

(a) Excluding Ns.

We used tBLASTx to compare the MPB scaffolds with the linkage groups in the T. castaneum genome (Figure 2A). Consistent with our hypothesis, the six scaffolds identified with very low SNV densities matched strongly to LG1=X in T. castaneum, while the other long MPB scaffolds matched strongly to other T. castaneum linkage groups. We also found that scaffolds matching strongly to LG4 were noticeably shorter than scaffolds matching to other linkage groups. Owing to difficulties in assembling similar but divergent sequences shared between neo-Y and neo-X into a haploid assembly, partially redundant and shorter scaffolds were likely to result.
This scenario would be consistent with LG4 sharing a common origin as the ancestral autosome that became neo-Y and part of neo-X in MPB. By contrast, sequence assembly from an individual female would not have these challenges (being neo-XX), so we sequenced an adult female beetle as well (Table 1, Table 2). Consistent with our hypothesis, we found that the scaffolds with gene models matching mostly to LG4 in T. castaneum (Figure 2B) were, on average, five times longer in the female MPB assembly than in the male MPB assembly, and the number of scaffolds was smaller (45 versus 98; Figure 2B). For the most part, the MPB scaffolds did not match defined positions in the T. castaneum linkage groups (Figure 2); rather, the matches were evenly spread out across a linkage group. This suggests that there have been significant intrachromosomal rearrangements since the separation of the two species from a common ancestor, more so than interchromosomal rearrangements. However, for scaffolds with gene models matching to linkage groups 3 and 8 in T. castaneum, some localization of the matches was apparent, possibly indicating past large interchromosomal rearrangements or fusions.

To examine the shared synteny in more detail for LG1=X and the MPB scaffolds hypothesized to be part of the ancestral X portion of neo-X, we matched the gene models on these MPB scaffolds to the gene models in T. castaneum (Figure 3). Although approximately 70% of the gene models on these scaffolds matched gene models on LG1=X, large stretches of shared synteny were not apparent, indicating the extent of intrachromosomal rearrangements that have occurred.

Figure 1 Heterozygosity of an individual male. The individual male mountain pine beetle (MPB) sequence data used for the assembly was mapped back onto the assembly, and the level of heterozygosity (allelic variation) was determined. Inset, a restricted range of single-nucleotide variant (SNV) density. Red markers indicate scaffolds with very low SNV density, which are hypothesized to represent scaffolds on the ancestral X chromosome portion of the neo-X chromosome. These are the six scaffolds shown in Figure 3.

We thus identified scaffolds in the assembly that were likely to be on the sex chromosomes, but additional experimental data such as linkage maps are necessary for confirmation. In addition, we determined that although MPB and T. castaneum diverged more than 200 Mya, there was still evidence of shared synteny, although major intrachromosomal rearrangements were also apparent.

Repetitive elements

Known arthropod and novel MPB repetitive elements were detected with RepeatMasker, RepBase Update, and RepeatScout. Repetitive elements occupied approximately 17 and 23% of the male and female genome assemblies, respectively (see Additional file 3, Table 1). This percentage is in the range of 8 to 42% that has been found in other insects [34]. Only 7% of the repetitive sequence had similarity to the known arthropod repeats in RepBase. The remainder appeared to be unique to MPB, with 3,429 and 2,941 novel elements appearing at least 10 times in the male and female genome assemblies, respectively. When these novel repeats were used to examine the T. castaneum genome, only 0.15% of this genome contained any of these novel MPB repeats, suggesting very little commonality between repeats in Coleoptera.

Horizontal gene transfer

In the gene predictions of three male scaffolds and one female scaffold, we found a nearly identical gene, which had a high similarity match to sucrose-6-phosphate hydrolases (scrB) from enterobacteria, particularly Klebsiella spp. and Rahnella aquatilis (BLASTp e-value <1 × 10^-140, 49% amino acid identity) (Figure 4A). These bacteria, especially R. 
aquatilis, are known to be associated with MPB [35], Dendroctonus frontalis [36], Dendroctonus rhizophagus [37], and Dendroctonus valens [27]. However, there was no evidence that other genes from these bacteria were present in the assembly. To confirm that this gene model was part of the MPB genome and not an assembly artifact of contaminating DNA, we successfully amplified and sequence verified a section of the genomic DNA about 4 kbp in length, which included both an adjacent beetle gene and the putative transferred bacterial gene.

Figure 2 Shared synteny between male and female mountain pine beetle (MPB) assembly scaffolds and Tribolium castaneum linkage groups. Sequences were compared by tBLASTx and regions of significant similarity (e-value <1 × 10^-20) are indicated by lines representing each high-scoring segment pair (appearing as dots at this scale). MPB scaffolds are displayed ordered from shortest to longest. The 20 longest MPB scaffolds are demarcated with faint horizontal lines. (A) Male and (B) female MPB scaffolds. (B) is longer than (A) due to a larger reconstruction size (including Ns) in the female compared with the male. A series of horizontal dots within one T. castaneum linkage group indicates a MPB scaffold sharing similarity with this linkage group. A linkage group for the yp chromosome in T. castaneum has not been described.

Transcripts corresponding to this locus (Dpon-scrB) are present in the transcriptome of beetles collected in different geographic regions in Canada [NCBI GAFW00000000 and GAFX00000000, 24] and the USA [NCBI dbEST GO486754, 23]. In addition, highly similar orthologous sequences (tBLASTn 88% identity, e-value <1 × 10^-165) were found in transcriptome sequences from the southern pine beetle (D. frontalis; Dfro-scrB) (Keeling et al., in preparation; NCBI GAFI00000000).
However, orthologous sequences could not be found in the available EST, nucleotide, and/or SRA transcriptomic data publicly available at NCBI for more distantly related Scolytinae such as the coffee berry borer (Hypothenemus hampei, NCBI dbEST FD661949-FD663980), the pine engraver beetle (Ips pini) [22], or the European spruce beetle (Ips typographus) ([38], NCBI GACR01000000), or for other Coleoptera such as the white pine weevil (Pissodes strobi; NCBI GAEO00000000; Wytrykush et al., in preparation) and T. castaneum.

Figure 3 Shared synteny between Tribolium castaneum LG1=X and scaffolds representing the ancestral X portion of neo-X of the male mountain pine beetle (MPB) assembly. Each trapezoid connects a matching gene model between the two organisms: red trapezoid, parallel orientation; green trapezoid, anti-parallel orientation. Scaffolds Seq_1101913, Seq_1101939, and Seq_1102823 partially overlap, and scaffolds Seq_1102689, Seq_1102308, and Seq_1102713 are contained in one scaffold in the female assembly. The order and orientation of these two groups of scaffolds are otherwise arbitrary.

Figure 4 Horizontal gene transfer. (A) Schematic of the location of the sucrose-6-phosphate hydrolase gene (scrB, red arrow) on the male and female scaffolds. Green arrows indicate adjacent gene models with similarity to insect proteins by BLASTx (post-glycosylphosphatidylinositol (GPI) attachment to proteins factor 2-like (pgap2-like) and hypothetical protein (hypo)). Blue arrows indicate the location of primers used to amplify intergenic regions between horizontal gene transfer (HGT) and adjacent beetle genes. Grey dashed lines indicate the similar gene models on the different scaffolds. (B) Presence of scrB in different beetle species. Green check marks and red Xs indicate presence and absence of the sucrose-6-phosphate hydrolase gene, respectively. Abbreviations: Dfro, Dendroctonus frontalis; Dmic, Dendroctonus micans; Dpon, Dendroctonus ponderosae; Dpun, Dendroctonus punctatus; Ipin, Ips pini; Ityp, Ips typographus; Pstr, Pissodes strobi; Tcas, Tribolium castaneum. Divergence dates estimated from Sequeira and Farrell [40]. (C) Phylogeny of scrB proteins from Dendroctonus (in red) with Gammaproteobacteria scrB and similar insect proteins. Abbreviations: bFF, beta-fructofuranosidase; Bimp, Bombus impatiens; Blic (in purple), Bacillus licheniformis; Bmor, Bombyx mori; Cfre, Citrobacter freundii; Cint, Commensalibacter intestini; Dfro, D. frontalis; Dmic, D. micans; Dpon, D. ponderosae; Dpun, D. punctatus; Eaer, Enterobacter aerogenes; Ebac, Enterobacteriaceae bacterium 9_2_54FAA; Eclo, Enterobacter cloacae; Ecol, Escherichia coli; Esp, Enterobacter sp.; Etas, Erwinia tasmaniensis; Koxy, Klebsiella oxytoca; Kpne, Klebsiella pneumoniae; Ksp, Klebsiella sp.; Kvar, Klebsiella variicola; Pcar, Pectobacterium carotovorum; Pret, Providencia rettgeri; Prus, P. rustigianii; Raqu, Rahnella aquatilis; Rsp, Rahnella sp.; sacC, glycoside hydrolase; scrB, sucrose-6-phosphate hydrolase; Sfle, Shigella flexneri; Sodo, Serratia odorifera; Sply, Serratia plymuthica; Sson, Shigella sonnei; Ssp, Serratia sp.; Yald, Yersinia aldovae; Yent, Yersinia enterocolitica; Yfre, Yersinia frederiksenii; Yroh, Yersinia rohdei. Branches with dots had greater than 80% bootstrap support. The tree was rooted with Bombyx mori beta-fructofuranosidase (Bmor-bFF, in blue).
We were also able to amplify the orthologous scrB, using primers designed from the MPB and SPB scrB, from the genomic DNA of the European great spruce beetle (Dendroctonus micans) and the North American Allegheny spruce beetle (Dendroctonus punctatus) (Figure 4B,C). Amplicons from both species shared 91% amino acid identity to the corresponding region of the MPB scrB, and were 97% identical to each other. These two spruce-infesting Dendroctonus species are closely related to each other, but are phylogenetically distant from MPB [39] and have different contemporary geographical ranges in Europe and North America, respectively. Although the depth of sequences available for other Scolytinae is limited and/or tissue-specific, the presence of this locus only in Dendroctonus species suggests that a horizontal bacterium-to-insect gene-transfer event may have occurred during or before the divergence of Dendroctonus species approximately 25 to 40 Mya [40,41], but after expansion of Scolytinae, approximately 85 Mya [3,41] (Figure 4B).

Most of the other gene models on the three scaffolds that contained Dpon-scrB matched to genes in T. castaneum on LG4. This evidence, and the presence of two loci in the male assembly and one in the female assembly, suggested that this gene was located on the ancestral autosome portion of neo-X and neo-Y (Figure 4A). The closest match to an insect protein was a scrB-like protein from Bombus impatiens (XP_003494683, Bimp-scrB, Figure 4C), with 42% identity and an e-value of 5 × 10^-123, which, because its closest neighbors are bacterial proteins (enterobacteria Erwinia spp., Providencia spp., and Yersinia spp.), may also be the result of a horizontal gene transfer, as has been described in Bombyx mori [42].

In bacteria, scrB catalyzes the hydrolysis of sucrose-6-phosphate to glucose-6-phosphate and fructose for carbohydrate utilization via the phosphotransferase system [43].
If this gene is expressed in MPB and is translated into a functional enzyme, it may contribute to the metabolism of carbohydrates in beetles. The Sanger EST sequences for this gene were obtained mainly from cDNA libraries originating from adult midgut/fat body tissue [24], while short-read RNA-seq data of separate adult midgut and fat body tissue (Keeling et al., in preparation) indicated expression in the midgut only. The specific expression of this horizontally acquired gene in digestive tissue suggests an adaptation of the beetle to facilitate digestion of host pine and/or fungal or bacterial carbohydrates. Based on the present analysis, in future work it should be possible to test such a function and role in Dendroctonus using biochemical or RNA interference methods. Another member of the Scolytinae has also recently been shown to harbor a horizontally transferred gene involved in carbohydrate metabolism. The coffee berry borer beetle (H. hampei) has a mannanase of apparent bacterial (Bacillus) origin [44]. An ortholog of this gene could not be found in either MPB assembly.

Single-nucleotide polymorphisms

To examine the variation and distribution of SNPs across the genome, we mapped short-read sequences of genomic DNA from pooled beetles (sampled from seven locations in Canada and one location in USA) to the male genome assembly, and then identified SNPs. We found a total of 1.69 million SNPs, with an allele that differed from the consensus sequence at a frequency of at least 6.25% in the sequence data from pooled populations. A small number of these, totaling 1.7% of all SNPs, were monomorphic relative to the consensus sequence. In addition, 97.8% of the SNPs were dimorphic, 0.5% were trimorphic, and 0.001% were tetramorphic. In total, 6.6% of the SNPs were found in exonic regions, 16.0% in intronic regions, and 77.4% in intergenic regions. These regions represent 8.4%, 13.0%, and 78.6% of the non-N portions of the assembled genome, respectively.
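As a rough consistency check on these shares, the ratio of a region's share of SNPs to its share of the assembled genome gives its SNP density relative to the genome-wide average. The sketch below is illustrative only and is not part of the analysis pipeline used here; the dictionary and function names are hypothetical, and the percentages are those quoted above.

```python
# Shares quoted in the text: percent of all SNPs, and percent of the
# non-N assembly, falling in each region class.
SNP_SHARE = {"exonic": 6.6, "intronic": 16.0, "intergenic": 77.4}
GENOME_SHARE = {"exonic": 8.4, "intronic": 13.0, "intergenic": 78.6}


def relative_density(region):
    """SNP density of a region relative to the genome-wide average (1.0 = average)."""
    return SNP_SHARE[region] / GENOME_SHARE[region]


for region in SNP_SHARE:
    print(f"{region}: {relative_density(region):.2f} x the genome-wide density")
```

A ratio below 1 means a region carries fewer SNPs than its size alone would predict. With these shares, exonic regions fall below the genome-wide average, intronic regions above it, and intergenic regions close to it.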
On average, the SNP density was 7.55 SNPs/kbp (Figure 5) but varied between exonic (5.92 SNPs/kbp), intronic (9.27 SNPs/kbp), and intergenic (7.43 SNPs/kbp) regions. Although comparative data are limited in insects, MPB had a higher SNP density compared with the horned beetle (Onthophagus taurus) and the varroatosis mite (Varroa destructor), which average 5.67 SNPs/kbp [45] and 0.062 SNPs/kbp [46] respectively, but less than the 16.5 SNPs/kbp found in two Lycaeides butterfly species [47]. A comprehensive analysis of SNP variation across the geographic range of MPB and between the above eight populations is currently in progress.

Orthology

The total number of 13,088 gene models identified in the male MPB assembly was similar to the number (16,404) of gene models initially found in the T. castaneum genome [2]. As this is only the second report of a beetle genome, we compared the protein predictions in MPB against those of T. castaneum, Apis mellifera (honey bee), B. mori (silk moth), Drosophila melanogaster, and Acyrthosiphon pisum (pea aphid) by clustering them into orthologous groups to determine if there were coleopteran-specific orthologs, and whether the MPB genome has signatures of expanded or contracted gene families relative to T. castaneum. We found 413 protein groups, representing 1,055 predicted proteins, which were unique to MPB (Figure 6A). Of these predicted proteins, only 51 had no measurable expression based upon RNA-seq data from whole larvae and adult beetles (Keeling et al., in preparation). Of the 413 protein groups, 25% had no similarity to proteins in NCBI nr (e-value <1 × 10^-5). The remaining 75% were similar to, but not deemed so similar as to be orthologous to, known proteins, and two-thirds of these were annotated to 'hypothetical proteins' in the other organisms.

There were 6,663 orthologous groups shared between MPB and T. castaneum (Figure 6A).
Of these, most (83%) had an n to n relationship between the number of MPB and T. castaneum proteins in each group, but 12% of the groups contained more MPB proteins than T. castaneum proteins, and 5% contained fewer MPB proteins than T. castaneum proteins (Figure 6B). We found 371 groups that were present in both MPB and T. castaneum but not in any of the other compared species (Figure 6A), suggesting the existence of some coleopteran-specific groups.

Specific gene families

MPB spends most of its life cycle, except for a brief period of dispersal flight, under the bark of trees, where it is exposed to pine host defenses such as an abundance of oleoresin terpenoids [48]. Lignified cell walls and bark tissue that are rich in phenolics may also limit nutrients available for growth, development, and reproduction of the beetle. Thus, we anticipated that MPB has adapted to this hostile environment by evolving genes for detoxification of host pine chemicals and digestion of pine bark and wood tissue. We examined two gene families, the P450 cytochromes and the glutathione S-transferases (GSTs), which are commonly involved in detoxification of plant chemicals, and from which some members are likely to be involved in the sequential pathway of metabolizing xenobiotics by making them more polar and excretable. Although the microorganisms associated with MPB are thought to facilitate the digestion of plant cell walls and lignins, and the concentration of nitrogen [49], insects have increasingly been shown to possess the ability to degrade plant cell walls themselves [50,51], and thus we also examined the family of plant cell wall-degrading enzymes (PCWDEs).

Cytochrome P450 gene family

The P450 cytochromes are a large family of enzymes associated with many processes of insect biology, particularly hormone biosynthesis, detoxification of xenobiotics (including plant defense compounds), pheromone biosynthesis, and insecticide resistance [52]. Insect P450s are found in four clades, CYP2, CYP3, CYP4, and mitochondrial, and are grouped further into families and subfamilies [52]. Members of CYP families share more than 40% amino acid identity (for example, the CYP9s), while members of subfamilies share more than 55% amino acid identity (for example, the CYP9Zs). For MPB, we found a total of 7 CYP2, 47 CYP3, 22 CYP4, and 9 mitochondrial P450s (Figure 7). This total of 85 P450s found in the MPB genome was lower than the 134 identified in T. castaneum (8, 70, 44, and 9 in the respective clades), but it was within the range of the number of P450s found in other sequenced insect genomes [52,53].

Figure 5 Single-nucleotide polymorphism (SNP) density across the scaffolds for eight populations of beetles. Inset shows a restricted range of SNP density.

We saw evidence for lineage-specific expansions (‘blooms’ [53]) in the CYP4, and particularly in the CYP6 and CYP9, MPB P450 families within the CYP4 and CYP3 clades, respectively (Figure 7). These blooms did not have orthologs in T. castaneum, nor did the blooms present in T. castaneum have orthologs in MPB. This pattern suggests that specific P450 family expansions have occurred in the lineages of each beetle species as part of their adaptations to different environments. The CYP2 and mitochondrial CYPs contained predominantly orthologous P450s across the species compared, and included the conserved P450s for 20-hydroxyecdysone biosynthesis (DponCYP302A1, DponCYP306A1, DponCYP307A1, DponCYP307B1, DponCYP314A1, DponCYP315A1) and deactivation (DponCYP18A1) [54], juvenile hormone biosynthesis (DponCYP15A1) [53], cuticle formation (DponCYP301A1 [55]), and circadian rhythm (DponCYP49A1 [56]).

We found several instances in which multiple P450s were clustered on scaffolds, with nineteen clusters of two, five clusters of three, and one cluster of seven P450s.
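The family and subfamily identity thresholds quoted above (more than 40% and more than 55% amino acid identity) can be written as a small rule. This is a minimal sketch only: the function name `classify_p450_pair` and its percent-identity input are hypothetical, and the thresholds are treated as hard cutoffs purely for illustration rather than as the naming procedure used in this study.

```python
def classify_p450_pair(percent_identity: float) -> str:
    """Classify a pair of P450 proteins by amino acid identity.

    Hypothetical helper: applies the >40% (family) and >55% (subfamily)
    thresholds as strict cutoffs.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity > 55.0:
        return "same subfamily"
    if percent_identity > 40.0:
        return "same family"
    return "different families"


if __name__ == "__main__":
    # Illustrative identity values, not measurements from the paper.
    for identity in (62.0, 47.5, 31.0):
        print(identity, classify_p450_pair(identity))
```

Under this rule, two proteins that would both be named CYP9Z share more than 55% identity, whereas a CYP9Z and another CYP9 outside that subfamily fall between the two thresholds.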
The cluster with seven P450s contained only CYP9Zs, and represented approximately one-half of all the CYP9s found in the MPB genome. In T. castaneum, five CYP9s were also found in a cluster. Very few CYP9s have been functionally characterized in any insect, but some can hydroxylate monoterpenes [57]. Only one P450, DponCYP307A1 (spook), was found on one of the putative ancestral X scaffolds (Seq_1102713). Its ortholog in T. castaneum is found on LG1=X.

Based on the observed patterns of specific P450 blooms in MPB, it is possible that these P450s have important functions for MPB to survive in the hostile chemical environment of the bark tissue of living pine trees. As genomes of other bark beetle species, including those that colonize dead trees or non-coniferous hosts, will be sequenced in future work, it should become possible to reconstruct the evolution of the observed blooms and to identify possible associations of such blooms with the colonization by MPB of living coniferous host trees, which are particularly rich in terpenoid and phenolic defenses. These diversified MPB P450s are also relevant targets for functional characterization, which is currently underway.

Figure 6 Orthologs. Comparative analysis of orthologous protein groups between six sequenced insect genomes. Predicted proteins from the genome sequences of mountain pine beetle (MPB), Tribolium castaneum, Apis mellifera (honey bee), Bombyx mori (silk moth), Drosophila melanogaster, and Acyrthosiphon pisum (pea aphid) were clustered into orthologous groups with OrthoMCL [84]. (A) Venn diagram indicates the number of protein groups found in either one or both beetle species among the 12,156 orthologous groups found. Numbers in parentheses indicate the percentage of these groups that were not found in any of the four non-beetle species. (B) Of the 6,663 groups found in both beetle species, 83% had an n to n correspondence between the two beetle species, whereas other groups had more or fewer members in mountain pine beetle (MPB) versus T. castaneum. The histogram indicates the distribution of the ratios of the number of MPB to T. castaneum proteins in each group.

Glutathione S-transferase gene family

GSTs are ubiquitous in organisms ranging from prokaryotes to animals. Although their functions are diverse [58], typical GST-mediated reactions involve the conjugation of glutathione with a substrate, often a toxin [59-62]. The addition of glutathione to the substrate increases the solubility of the substrate, aiding in detoxification and excretion. GSTs often act in concert with other enzymes, such as cytochromes P450 and epoxide hydrolases, which catalyze earlier reactions that prepare the substrate for the GST-catalyzed reaction [61].

Figure 7 Phylogeny of P450s. Phylogeny of MPB (Dendroctonus ponderosae, Dpon) P450s with those from the honey bee (Apis mellifera, Amel), silk moth (Bombyx mori, Bmor), and red flour beetle (Tribolium castaneum, Tcas). Red arcs indicate areas of expansion of the mountain pine beetle (MPB) CYP4, and both CYP6 and CYP9 P450 families within the CYP4 and CYP3 clades, respectively. Branches with dots had greater than 80% bootstrap support. The tree was rooted with human (Homo sapiens, Hsap) CYP3A4.
[Figure 7 tree. Scale bar: 0.1. Clade labels: CYP2 clade, CYP3 clade, CYP4 clade, mitochondrial clade. Family labels: CYP4, CYP6, CYP9. Species key: Dendroctonus ponderosae, Tribolium castaneum, Apis mellifera, Bombyx mori.]

Insects have six known classes of cytosolic GSTs (designated Delta, Epsilon, Omega, Sigma, Theta, and Zeta) plus a few individual entities that do not fit neatly into the current classification [63].
The Delta and Epsilon classes are unique to insects, are thought to be generally involved in detoxification reactions, and often contain a large number of representatives [58,61-63].

We found a total of 28 GSTs in the MPB genome (Figure 8), representing each of the six major classes. Specifically, we found 6 Delta GSTs, 12 Epsilon GSTs, 2 Omega GSTs, 5 Sigma GSTs, 2 Theta GSTs, and 1 Zeta GST. In many cases, the MPB GSTs had close orthologs in T. castaneum, which contains 3, 18, 3, 6, 1, and 1 GSTs, respectively, of the aforementioned classes, but obvious orthology was not always the case. The Epsilon class of MPB GSTs contained three small groups of genes (DponGSTe1, DponGSTe2, and DponGSTe3; DponGSTe4 and DponGSTe5; and DponGSTe6 and DponGSTe7) without orthologs in T. castaneum. Similarly, DponGSTd3 and DponGSTd6 did not have close T. castaneum orthologs. These GSTs without orthologs in T. castaneum may indicate an expansion of the GSTs in MPB or a contraction in T. castaneum.

As is common in gene families that arose from gene-duplication events, 16 of the 28 GSTs were found in clusters on genomic scaffolds. One small cluster contained two Sigma class GSTs (DponGSTs1 and DponGSTs2), while another cluster of two genes contained the only Zeta class GST found in MPB, along with one of the two Theta class GSTs (DponGSTt1 and DponGSTz1). The remaining two larger clusters contained all of the Epsilon GSTs; one contained five GSTs (DponGSTe2, DponGSTe3, DponGSTe4, DponGSTe5, and DponGSTe8), and the other contained the remaining seven GSTs (DponGSTe1, DponGSTe6, DponGSTe7, DponGSTe9, DponGSTe10, DponGSTe11, and DponGSTe12).

The 18 Epsilon and Delta GSTs identified in MPB indicate their likely importance in detoxifying the defense metabolite-infused pine tissue in which the larval and adult beetles exist. Whereas T. castaneum has had substantial exposure to pesticides in recent history, MPB has not.
This may allow comparison of functions of orthologs between the two species, with the aim of understanding the development of GST-associated pesticide resistance. Some of the novel MPB GSTs, particularly from the Delta and Epsilon classes, will be investigated in future work for potential detoxification roles against secondary metabolites in pine.

Plant cell wall-degrading enzymes

Plant cell walls are composed primarily of lignin and the carbohydrates cellulose, hemicellulose, and pectin. Microorganisms are effective at metabolizing these plant cell-wall carbohydrates for energy by using PCWDEs. MPB must be able to metabolize the cell walls of the woody pine species upon which it develops, either with the help of associated microorganisms, or on its own. Until recently, PCWDEs were thought to be absent in insects, because the sequenced model organisms such as D. melanogaster and B. mori lacked these genes. However, recent work has shown that PCWDEs are in fact both present and diverse in insects [50,51], particularly in the Coleoptera.

We manually annotated approximately 80 gene models coding for PCWDEs that were at least 100 amino acids long in the male genome assembly. These were reduced to 52 non-redundant PCWDEs by comparing protein translations, including: six glycoside hydrolase family 48 proteins, seven polysaccharide lyase family 4 proteins, eight endo-β-1,4-glucanases, nine pectin methylesterases, and twenty-two endopolygalacturonases. Compared with a previous analysis [50] of our MPB Sanger EST data [24], we identified an additional nine PCWDEs in the MPB genome: two pectin methylesterases, two polysaccharide lyase family 4 proteins, and five endopolygalacturonases. In addition, we found three endopolygalacturonases that are probably pseudogenes. An ortholog to the T. castaneum glycoside hydrolase family 9 protein could not be found in either the MPB genome or transcriptome, although orthologs have been found in other insect species [51].
Although the identification of PCWDEs in insects has been limited, and has been studied most thoroughly in Coleoptera [50], MPB now has the largest family of PCWDEs described to date. In addition to understanding their roles in metabolizing plant cell walls for the nutrition of the beetle, and the relative role they play compared with that of the associated microorganisms in degrading the pine host tissues, there is also interest in the discovery of new insect PCWDEs for use in biotechnological applications, such as the degradation of cellulose for conversion of plant cell-wall sugars into biofuels or other bioproducts [64].

Conclusions

Our demonstration of a successful de novo assembly of a draft genome sequence from short-read sequences from an individual insect without inbreeding to reduce heterozygosity is a forerunner to projects that aim to sequence non-model insects, such as in the i5k initiative [65]. The genome sequence of MPB provides a novel resource in Curculionoidea, many species of which are economically and ecologically important pests in forestry and agriculture. It also provides comparative data for the T. castaneum genome sequence to study the evolution of Coleoptera and insects in general. Our initial efforts to identify portions of the assembly corresponding to the sex chromosomes provide the framework for more comprehensive linkage studies using other approaches.

Our analyses of the MPB genome sequence identified unique features that may benefit the beetle for living in a plant defense-rich and nutrient-poor environment over a large geographic and bioclimatic range.
The large number of SNPs identified here, when examined more thoroughly across multiple populations, will provide insights into the structure and natural variation in MPB populations throughout western North America, and may facilitate understanding of how the beetle is adapting to new hosts, new geographical ranges, and climatic conditions.

The presence of a bacterial sucrose-6-phosphate hydrolase in the genome sequence of MPB, which is also present in other Dendroctonus species with different hosts and different geographical ranges, and its tissue-specific expression in MPB midguts, suggest a unique functional integration of a horizontally transferred gene for carbohydrate utilization by MPB. Additional studies are necessary to confirm the role of this horizontally transferred gene in host colonization as a complement to the PCWDEs that were found in abundance in the MPB genome.

Some members of the P450 and GST gene families are probably involved in the detoxification of host defense compounds and in the production of pheromones that facilitate mass attack of the tree. The specific areas of expansion of these gene families relative to T. castaneum and other insect species hint at their roles in the specific biological processes necessary for MPB to be such a significant pest of pines.

Figure 8 Phylogeny of glutathione S-transferases (GSTs). Branches with dots had > 80% bootstrap support. The tree was rooted with human HsapGSTA1. Aaeg, Aedes aegypti; Agam, Anopheles gambiae; Amel, Apis mellifera; Apis, Acyrthosiphon pisum; Bmor, Bombyx mori; Cqui, Culex quinquefasciatus; Dmel, Drosophila melanogaster; Dpon, Dendroctonus ponderosae; Hsap, Homo sapiens; Nvit, Nasonia vitripennis; Phum, Pediculus humanus; Tcas, Tribolium castaneum.
[Figure 8 tree. Scale bar: 0.1. Class labels: Delta, Sigma, Zeta, Epsilon, Theta, Omega. Key: Dendroctonus ponderosae, Tribolium castaneum, Hymenoptera, Lepidoptera, Diptera, Hemiptera, Phthiraptera.]

Methods

Origin of beetles and DNA extraction

Pupae were chosen as the DNA source to minimize contaminating DNA from gut microorganisms and host tissue, because larvae void their gut prior to pupation. For PET sequencing, genomic DNA was extracted from an individual wild-collected male pupa removed from a bolt of a lodgepole pine (P. contorta) tree felled along the Kay Kay Forest Service Road northwest of Prince George, BC, Canada (approx. N 54° 02.731’ W 123° 19.109’) in the fall of 2006. The sex of the individual pupa was established as male based upon microsatellite markers [66] on the isolated genomic DNA, and the markers were confirmed in the assembly. MPET sequencing for scaffolding used genomic DNA extracted and combined from 45 pupae, which were removed from one pine bolt obtained from a different tree, but felled at the same time and location.
For sequencing the female genome, genomic DNA was extracted from an individual adult insect that emerged from a bolt collected at the same time and location as the pupae. Voucher specimens from the same cohort of insects have been submitted to the E H Strickland Entomological Museum at the University of Alberta and the Beaty Biodiversity Museum at the University of British Columbia. Frozen beetle tissue was homogenized using a cell disrupter (BeadBeater; Bio-Spec, Bartlesville, OK, USA) and the genomic DNA extracted (DNeasy Mini Plant Extraction Kit; Qiagen Inc., Valencia, CA, USA).

Library construction and sequencing

For the male genome, two PET libraries with average fragment sizes of 590 and 630 bp were prepared with a commercial kit (Paired-End DNA Sample Prep Kit; Illumina Inc., San Diego, CA, USA) following the manufacturer’s protocol (Paired-End Library Construction). For the female genome, one PET library with an average fragment size of 425 bp was prepared using the same protocol but a different kit (NEBNext Kit; New England Biolabs, Beverly, MA, USA). For scaffolding, three MPET libraries with average fragment sizes of 6,600, 10,000, and 12,000 bp were prepared (Mate Pair Library Prep Kit, version 2; Illumina) following the manufacturer’s protocols but with some in-house modifications for the larger fragment sizes. The male sequencing and sequencing for scaffolding were completed on two different sequencing systems (GAII and HiSeq; both Illumina) and the female sequencing was completed on the HiSeq system.

Assembly methods

Male assembly

The sequences generated from the two PET libraries for the male pupa and three MPET libraries for the pooled pupae were assembled using ABySS (version 1.3.0 [67]). The paired-end libraries were used for the de Bruijn graph assembly and paired-end assembly.
The mate-pair libraries were used to scaffold the assembly, but were not used for their sequence data because of the chimeric reads produced by mate-pair libraries, and because the DNA originated from pooled beetles.

Mate-pair libraries contain a mixture of large-fragment reads that originate from DNA fragments containing the biotin label and of short-fragment reads that originate from DNA fragments that do not contain the biotin label. To enrich for large-fragment reads, we aligned the mate-pair reads to an earlier stage of the assembly, and removed read pairs that aligned with forward-reverse orientation rather than the expected reverse-forward orientation.

The ABySS assembly parameters were set to k = 64 for the de Bruijn graph assembly, s = 500 and n = 10 for the paired-end assembly, s = 1100 and n = 25 for scaffolding with the 6 kbp mate-pair library, and s = 3400 and n = 3 for scaffolding with the 10 kbp and 12 kbp mate-pair libraries. The parameter k is the size of a de Bruijn graph k-mer. The parameter s is the minimum size of contig to consider placing in a scaffold, and the parameter n is the minimum number of paired read links required to merge two contigs in a scaffold.

Similar but not identical sequences of the genome are difficult to assemble, as they cause branches in the ABySS assembly graph referred to as bubbles. For a typical genome assembly, bubbles may be caused by near-repeats or by heterozygous variation. The MPB assembly faced these typical issues as well as the divergence of the neo-X and neo-Y chromosomes. The bubble-popping algorithm of the de Bruijn graph assembly stage of ABySS popped bubbles that were shorter than 3k in length, allowing for two SNVs within k of each other. SNVs separated by more than k formed two separate bubbles, which were popped individually. After the de Bruijn graph assembly was performed, bubbles longer than 3k were popped if the two branches of the bubble had a minimum of 90% identity. When a bubble was popped, the sequence with the most coverage was selected to represent the bubble. The other branch of the bubble was stored in an auxiliary file.

A sequence overlap graph represents each sequence as a vertex and each overlap of two sequences as a directed edge. For a simple bubble between two vertices u and v, the out-neighborhood of vertex u, N⁺(u), is identical to the in-neighborhood of vertex v, N⁻(v) (see Additional file 2, Figure 2A). For a complex bubble, all paths starting from vertex u must eventually pass through vertex v, but the structure of the subgraph inside the bubble is more complex than a simple bubble (see Additional file 2, Figure 2B).

ABySS identified complex bubbles when the bubble subgraph was a directed acyclic graph. Rather than pick a representative sequence or consensus sequence, ABySS scaffolded over the complex bubble by replacing it with a span of Ns in the assembly, whose length represented the longest path through the bubble. Both simple bubble-popping and scaffolding over complex bubbles increased the N50.

EST sequences [24] and contigs assembled by ABySS from RNA-seq data were aligned to the genomic scaffolds, and expressed sequences that spanned multiple scaffolds were used as evidence to merge those scaffolds, requiring a minimum of two expressed sequences from different libraries supporting the same merge. The two scaffolds in such merges were separated by a gap of 100 Ns. Finally, we used Anchor 0.2.7 [68] to correct small misassemblies, reduce or close scaffold gap sequences, and extend the ends of scaffold sequences. Only the paired-end libraries were used with Anchor.

Female assembly

The female assembly was generated in a similar manner to the male assembly, except that ABySS version 1.3.3 with k = 96 and Anchor version 0.3.1 were used.
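The simple-bubble condition given in the male assembly description (the out-neighborhood of u equals the in-neighborhood of v) and the rule of keeping the branch with the most coverage can be illustrated on a toy graph. This is a sketch under stated assumptions, not the ABySS implementation: the function names `find_simple_bubbles` and `pop_bubble`, the edge-list input, and the coverage values are all hypothetical.

```python
from collections import defaultdict


def find_simple_bubbles(edges):
    """Return (u, v, branches) for each simple bubble in a directed graph.

    A simple bubble is reported when the successors of u are exactly the
    predecessors of v and there are at least two such branch vertices.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)

    bubbles = []
    for u in list(succ):
        branches = succ[u]
        if len(branches) < 2:
            continue
        # Any vertex v closing the bubble must follow every branch,
        # so it is enough to look at the successors of one branch.
        first = next(iter(branches))
        for v in succ.get(first, ()):
            if pred[v] == branches:
                bubbles.append((u, v, sorted(branches)))
    return bubbles


def pop_bubble(branches, coverage):
    """Keep the branch with the highest coverage (toy stand-in for popping)."""
    return max(branches, key=lambda branch: coverage[branch])


if __name__ == "__main__":
    # Toy graph: u splits into branches a and b, which rejoin at v.
    toy_edges = [("u", "a"), ("u", "b"), ("a", "v"), ("b", "v"), ("v", "w")]
    toy_coverage = {"a": 12, "b": 30}  # illustrative values only
    for u, v, branches in find_simple_bubbles(toy_edges):
        print(u, v, branches, "keep:", pop_bubble(branches, toy_coverage))
```

The sketch ignores sequence length and identity, so it does not model the 3k length limit or the 90% identity requirement described for the assembly.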
The same MPET sequences as for the male assembly were used for scaffolding.

Annotation

The male MPB genome assembly was initially annotated using the MAKER [69] pipeline, limiting annotation to scaffolds over 1 kb in length. This software synthesizes the results from ab initio gene predictors with experimental gene evidence to produce final annotations. Within the MAKER framework, RepeatMasker [70] was used to mask low-complexity genomic sequence based on the RepBase Coleoptera repeat library [71]. Also within the MAKER framework, AUGUSTUS [72], Snap [73], and GeneMark [74] were run to produce ab initio gene predictions. AUGUSTUS predictions were based on the included T. castaneum training set of genes, whereas Snap gene predictions were based on its own minimal training set, and GeneMark was self-trained. These three sets of predictions were combined with the BLASTx [75], BLASTn [75], and exonerate [76] alignments of 178,536 clustered EST sequences [24], 12 RNA-seq libraries (Keeling et al, in preparation) assembled with ABySS, and protein sequences from T. castaneum [2] to produce the final annotations. Known protein domains were then further annotated using InterProScan [77]. Both assemblies were also compared with the ultra-conserved core eukaryotic gene dataset [29] to assess completeness.

Analysis of sex chromosomes and shared synteny with the T. castaneum genome

Lacking linkage or physical maps, we attempted to identify portions of the sex chromosomes (especially the ancestral X portion of the neo-X chromosome) in silico. First, we mapped the genomic sequencing reads from the individual male pupa to the assembly with BWA [78] and then called SNVs with mpileup in SAMtools [79] with a minimum of 10-fold coverage.
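Once SNVs have been called, they can be summarised per scaffold as a density in SNVs per kbp, the same unit used for the SNP densities reported in the Results. The sketch below is not the authors' pipeline: the function names and the input mapping (scaffold name to SNV count and scaffold length) are hypothetical, and in a real workflow those counts would be derived from the BWA and SAMtools output.

```python
def snv_density_per_kbp(snv_count: int, scaffold_length_bp: int) -> float:
    """Return the number of SNVs per 1,000 bp of scaffold."""
    if scaffold_length_bp <= 0:
        raise ValueError("scaffold length must be positive")
    return snv_count / (scaffold_length_bp / 1000.0)


def rank_scaffolds_by_density(scaffolds):
    """Sort scaffolds from lowest to highest SNV density.

    `scaffolds` maps a scaffold name to (snv_count, length_bp).
    """
    densities = {
        name: snv_density_per_kbp(count, length)
        for name, (count, length) in scaffolds.items()
    }
    return sorted(densities.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative numbers only; scaffold names are made up.
    toy_scaffolds = {
        "scaffold_a": (2, 50_000),
        "scaffold_b": (400, 40_000),
        "scaffold_c": (300, 60_000),
    }
    for name, density in rank_scaffolds_by_density(toy_scaffolds):
        print(f"{name}\t{density:.2f} SNVs/kbp")
```

Scaffolds at the front of the ranking have the fewest SNVs per kbp and those at the end have the most.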
Scaffolds originating from the ancestral X portion of neo-X would be expected to have very low (theoretically zero, excluding sequencing errors) SNV density, as only one copy would be present in a male genome, whereas scaffolds originating from merged portions of the ancestral autosome portion of neo-X and neo-Y would be expected to have high SNV density because there are two copies in the male (from neo-X and from neo-Y). We hypothesized that the latter would be more divergent than autosomes because of reduced recombination, and that scaffolds originating from the ancestral X portion of neo-X would be longer on average because of more efficient assembly resulting from the presence of only one allele.

The level of shared synteny of the MPB assemblies to the linkage groups in T. castaneum was determined by tBLASTx (e-value <1 × 10⁻²⁰), and the resulting HSPs drawn in R [80]. To compare the scaffolds of the putative ancestral X portion of the male MPB assembly with LG1 = X in T. castaneum, we used BLASTp (e-value <1 × 10⁻²⁵) with the translated MPB gene models to identify the most similar translated T. castaneum gene models. We then determined the linkage group and position of the T. castaneum gene models using data from NCBI and BeetleBase [81], and plotted the matching pairs in OmniGraffle Pro (version 5.4) using a custom AppleScript.

Analysis of repetitive elements

Known repetitive elements in the scaffolds longer than 1,000 bp in each genome assembly were identified by RepeatMasker (version open-4.0.0 [70]) run with rmblastn (version 2.2.23+) against the arthropod repeats within RepBase Update (20120418) [71]. After these were masked in the assemblies, novel repetitive elements were identified by RepeatScout (version 1.0.5) [82], and those appearing at least 10 times in the genome were counted by RepeatMasker with assembly-specific repeat libraries.
To assess whether the novel repetitive elements identified in MPB could also be found in T. castaneum, we used RepeatMasker with the MPB assembly-specific repeat libraries after the T. castaneum genome was similarly masked with arthropod repeats within RepBase Update.

Analysis of horizontal gene transfer

To confirm the presence of sucrose-6-phosphate hydrolase (scrB) in the genome of MPB, we used a reverse primer in this gene and a forward primer in the adjacent hypothetical gene (Table 3) to amplify contiguous portions of these genes and the intervening intergenic region from genomic DNA. To determine whether this scrB is present in other Dendroctonus species, we used internal primers (Table 3) based upon the DNA and mRNA sequences of scrB from MPB and the mRNA sequence from SPB to amplify a segment approximately 1,100 bp in length with DNA from MPB, D. micans, and D. punctatus. The amplicons were cloned into pJET1.2 (Fermentas) and fully sequenced.

Genome-wide SNP analyses

To examine the variation and distribution of SNPs across the genome, we mapped short-read sequences of genomic DNA from pooled beetles (nine to fourteen beetles per pool) sampled from seven locations in Canada (three in BC; four in AB) and one location in the USA (SD) (see Additional file 4, Table 2) to the draft genome assembly, and then identified SNPs. The sequences from all populations were mapped as one dataset to the male assembly using CLC Genomics Workbench (version 5.0.1) with the following parameters: similarity 0.9; length fraction 0.5; insertion, deletion, and mismatch cost 3; min/max paired distance 200/600. The SNPs were then detected using the following parameters: window length 51, minimum quality 20, minimum coverage 20, required variant count 3, minimum variant frequency 6.25%.

Interspecies comparison

To compare the MPB genome with other sequenced insect genomes for orthologous groups and potential gene-family expansions, we obtained protein sequences for B. mori, A. pisum, A. mellifera, D.
melanogaster, andT. castaneum from the OrthoMCL-DB [83] and NCBIgenomes FTP site, and compared them with the genemodels of MPB using OrthoMCL [84] using default para-meters and an e-value of less than 1 × 10-10.Manual annotation of specific gene familiesUsing reciprocal BLAST against NCBI nr and genefamily-specific datasets, the gene families of cytochromesP450, GSTs, and PCWDEs were identified in the genomeassembly. Each gene model was manually annotated, andthen the non-redundant translated proteins were alignedwith MUSCLE [85] to the corresponding proteins fromseveral other insect species for which genomes have beensequenced. A maximum-likelihood phylogeny was cre-ated with FastTree 2 [86], and drawn with iTOL [87].Data accessThese Whole Genome Shotgun projects have been depos-ited at DDBJ/EMBL/GenBank under accession numbers[APGK00000000] (male) and [APGL00000000] (female).The versions described in this paper are the first versions,[APGK01000000] and [APGL01000000]. The raw sequencedata have been submitted to NCBI SRA with BioProjectID SRP014975, the assemblies have been submitted toNCBI with BioProject IDs PRJNA162621 (male) andPRJNA179493 (female), and the Tria Project is representedby NCBI umbrella BioProject PRJNA169907.Additional materialAdditional file 1: Supplementary Figure 1 Schematic of origin of neo-Xand neo-Y in mountain pine beetle (MPB).Additional file 3: Supplementary Table 1 Repeat analyses.Additional file 2: Supplementary Figure 2 Schematic of (A) simplebubbles and (B) complex bubbles.Additional file 4: Supplementary Table 2 Localities of mountain pinebeetle (MPB) samples used for genomic DNA sequencing for single-nucleotide polymorphism (SNP) analysis.AbbreviationsEST: Expressed sequence tag; FTP: File transfer protocol; GST: Glutathione S-transferase; MPB: Mountain pine beetle; MPET: Mate-paired end tag; Mya:Million years ago; NCBI: National Centre for Biotechnology Information;PCWDE: Plant cell wall-degrading enzymes; PET: Paired end tag; 
RNA-seq:RNA sequencing; scrB: Sucrose-6-phosphate hydrolase; SNP: Single-nucleotide polymorphism; SNV: Single-nucleotide variant.Authors’ contributionsCIK, SJMJ, and JB conceived of the study. CIK, ML, and HH preparedgenomic DNA, and investigated the horizontally transferred gene by PCRand sequencing. YZ, PP, and RM prepared the sequencing libraries anddirected the sequencing. NYL, TRD, SKC, GAT, DLP, SDJ, IB, and SJMJ refinedABySS, and subsequently assembled and annotated the genome. CIK, MMSY,and AN completed the analyses for shared synteny and the sexchromosomes. CIK, MMSY, and DPWH manually annotated and analyzed thedescribed gene families. CIK, MMSY, JKJ, and FAHS completed the SNPanalyses. CIK and JB wrote the manuscript. All authors read and approvedthe final manuscript.Table 3 Primer sequencesName Direction Sequencehypo Forward GGTGCTGCCTTTTCTTTGCTATTTscrB Reverse CCCAATAACCACATACCAAGACCscrB (internal) Forward CCAACATGGCTGGATGAATGACCCReverse CCTGAGCGCCTCCGTCTCTTTCKeeling et al. Genome Biology 2013, 14:R27http://genomebiology.com/content/14/3/R27Page 16 of 19Received: 10 October 2012 Revised: 8 March 2013Competing interestsThe authors declare that they have no competing interests.AcknowledgementsWe thank Ms Karen Reid (UBC) for excellent laboratory and project-management support, David R Nelson (University of Tennessee HealthScience Center) for his assistance with the naming of the P450s, Erin Clark(UNBC) for the MPB beetle samples, and François Mayer and Jean-ClaudeGrégoire (Université Libre de Bruxelles) for D. micans and D. punctatusgenomic DNA. 
This work was supported with funds from Genome Canada, Genome British Columbia, and Genome Alberta in support of the Tria 1 and Tria 2 projects (http://www.thetriaproject.ca) to CIK, DPWH, FAHS, SJMJ, and JB, the BC Ministry of Forests, Lands and Natural Resource Operations, the Natural Sciences and Engineering Research Council of Canada (NSERC EWR Steacie Memorial Fellowship and NSERC Strategic Project Grant to JB), and the British Columbia Knowledge Development Fund, the Canada Research Chair program and the Canada Foundation for Innovation (to DPWH). JB is a University of British Columbia Distinguished University Scholar and SJMJ is a senior scholar of the Michael Smith Foundation for Health Research. This research was enabled by the use of computing resources provided by WestGrid and Compute/Calcul Canada.

Author details
1 Michael Smith Laboratories, University of British Columbia, 301-2185 East Mall, Vancouver, BC, Canada V6T 1A4.
2 Canada’s Michael Smith Genome Sciences Centre, 570 W 7th Ave #100 Vancouver, BC, Canada V5Z 4S6.
3 Department of Biological Sciences, CW 405, Biological Sciences Bldg., University of Alberta, Edmonton, AB, Canada T6G 2E9.
4 Ecosystem Science and Management Program, University of Northern British Columbia, 3333 University Way, Prince George, BC, Canada V2N 4Z9.
5 Department of Medical Genetics, University of British Columbia, 4500 Oak St., Vancouver, BC, Canada V6H 3N1.
6 Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, BC, Canada V5A 1S6.

Accepted: 27 March 2013 Published: 27 March 2013

References
1. Hammond PM: Species inventory. In Global Biodiversity, Status of the Earth’s Living Resources. Edited by: Groombridge B. London: Chapman and Hall; 1992:17-39.
2. Richards S, Gibbs RA, Weinstock GM, Brown SJ, Denell R, Beeman RW, Gibbs R, Bucher G, Friedrich M, Grimmelikhuijzen CJ, Klingler M, Lorenzen M, Roth S, Schroder R, Tautz D, Zdobnov EM, Muzny D, Attaway T, Bell S, Buhay CJ, Chandrabose MN, Chavez D, Clerk-Blankenburg KP, Cree A, Dao M, Davis C, Chacko J, Dinh H, Dugan-Rocha S, Fowler G, et al: The genome of the model beetle and pest Tribolium castaneum. Nature 2008, 452:949-955.
3. Hunt T, Bergsten J, Levkanicova Z, Papadopoulou A, John OS, Wild R, Hammond PM, Ahrens D, Balke M, Caterino MS, Gomez-Zurita J, Ribera I, Barraclough TG, Bocakova M, Bocak L, Vogler AP: A comprehensive phylogeny of beetles reveals the evolutionary origins of a superradiation. Science 2007, 318:1913-1916.
4. Safranyik L, Carroll AL, Régnière J, Langor DW, Riel WG, Shore TL, Peter B, Cooke BJ, Nealis VG, Taylor SW: Potential for range expansion of mountain pine beetle into the boreal forest of North America. The Canadian Entomologist 2010, 142:415-442.
5. de la Giroday H-MC, Carroll AL, Aukema BH: Breach of the northern Rocky Mountain geoclimatic barrier: initiation of range expansion by the mountain pine beetle. Journal of Biogeography 2012, 39:1112-1123.
6. Cullingham CI, Cooke JE, Dang S, Davis CS, Cooke BJ, Coltman DW: Mountain pine beetle host-range expansion threatens the boreal forest. Molecular Ecology 2011.
7. Kurz WA, Dymond CC, Stinson G, Rampley GJ, Neilson ET, Carroll AL, Ebata T, Safranyik L: Mountain pine beetle and forest carbon feedback to climate change. Nature 2008, 452:987-990.
8. Wood SL: The Bark and Ambrosia Beetles of North and Central America (Coleoptera: Scolytidae), a Taxonomic Monograph. Salt Lake City: Brigham Young University; 1982.
9. Sequeira AS, Normark BB, Farrell BD: Evolutionary assembly of the conifer fauna: distinguishing ancient from recent associations in bark beetles. Proceedings of The Royal Society of London Series B-Biological Sciences 2000, 267:2359-2366.
10. Reeve JD, Anderson FE, Kelley ST: Ancestral state reconstruction for Dendroctonus bark beetles: evolution of a tree killer. Environmental Entomology 2012, 41:723-730.
11. Samarasekera GDNG, Bartell NV, Lindgren BS, Cooke JE, Davis CS, James PM, Coltman DW, Mock KE, Murray BW: Spatial genetic structure of the mountain pine beetle (Dendroctonus ponderosae) outbreak in western Canada: historical patterns and contemporary dispersal. Molecular Ecology 2012, 21:2931-2948.
12. Raffa KF, Aukema BH, Bentz BJ, Carroll AL, Hicke JA, Turner MG, Romme WH: Cross-scale drivers of natural disturbances prone to anthropogenic amplification: the dynamics of bark beetle eruptions. BioScience 2008, 58:501-517.
13. Kurz WA, Stinson G, Rampley GJ, Dymond CC, Neilson ET: Risk of natural disturbances makes future contribution of Canada’s forests to the global carbon cycle highly uncertain. Proceedings of the National Academy of Sciences of the United States of America 2008, 105:1551-1555.
14. Franceschi VR, Krokene P, Christiansen E, Krekling T: Anatomical and chemical defenses of conifer bark against bark beetles and other pests. New Phytologist 2005, 167:353-376.
15. Keeling CI, Bohlmann J: Genes, enzymes and chemicals of terpenoid diversity in the constitutive and induced defence of conifers against insects and pathogens. New Phytologist 2006, 170:657-675.
16. Safranyik L, Carroll AL: The biology and epidemiology of the mountain pine beetle in lodgepole pine forests. In The mountain pine beetle: A synthesis of biology, management, and impacts on lodgepole pine. Edited by: Safranyik L, Wilson B. Victoria, BC, Canada: Natural Resources Canada, Canadian Forest Service; 2006:3-66.
17. Symonds MR, Elgar MA: The evolution of pheromone diversity. Trends in Ecology and Evolution 2008, 23:220-228.
18. DiGuistini S, Liao NY, Platt D, Robertson G, Seidel M, Chan SK, Docking TR, Birol I, Holt RA, Hirst M, Mardis E, Marra MA, Hamelin RC, Bohlmann J, Breuil C, Jones SJ: De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biology 2009, 10:R94.
19. DiGuistini S, Wang Y, Liao NY, Taylor G, Tanguay P, Feau N, Henrissat B, Chan SK, Hesse-Orce U, Alamouti SM, Tsui CK, Docking RT, Levasseur A, Haridas S, Robertson G, Birol I, Holt RA, Marra MA, Hamelin RC, Hirst M, Jones SJ, Bohlmann J, Breuil C: Genome and transcriptome analyses of the mountain pine beetle-fungal symbiont Grosmannia clavigera, a lodgepole pine pathogen. Proceedings of the National Academy of Sciences of the United States of America 2011, 108:2504-2509.
20. Hesse-Orce U, DiGuistini S, Keeling CI, Wang Y, Li M, Henderson H, Docking TR, Liao NY, Robertson G, Holt RA, Jones SJM, Bohlmann Jr, Breuil C: Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera. BMC Genomics 2010, 11:536.
21. DiGuistini S, Ralph SG, Lim YW, Holt R, Jones S, Bohlmann J, Breuil C: Generation and annotation of lodgepole pine and oleoresin-induced expressed sequences from the blue-stain fungus Ophiostoma clavigerum, a mountain pine beetle-associated pathogen. FEMS Microbiology Letters 2007, 267:151-158.
22. Eigenheer AL, Keeling CI, Young S, Tittiger C: Comparison of gene representation in midguts from two phytophagous insects, Bombyx mori and Ips pini, using expressed sequence tags. Gene 2003, 316:127-136.
23. Aw T, Schlauch K, Keeling CI, Young S, Bearfield JC, Blomquist GJ, Tittiger C: Functional genomics of mountain pine beetle (Dendroctonus ponderosae) midguts and fat bodies. BMC Genomics 2010, 11:215.
24. Keeling CI, Henderson H, Li M, Yuen M, Clark EL, Fraser JD, Huber DPW, Liao NY, Docking TR, Birol I, Chan SK, Taylor GA, Palmquist D, Jones SJM, Bohlmann J: Transcriptome and full-length cDNA resources for the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major insect pest of pine forests. Insect Biochemistry and Molecular Biology 2012, 42:525-536.
25. Adams AS, Boone CK, Bohlmann J, Raffa KF: Responses of bark beetle-associated bacteria to host monoterpenes and their relationship to insect life histories. Journal of Chemical Ecology 2011, 37:808-817.
26. Muratoğlu H, Sezem K, Demirbağ Z: Determination and pathogenicity of the bacterial flora associated with the spruce bark beetle, Ips typographus (L.) (Coleoptera: Curculionidae: Scolytinae). Turkish Journal of Biology 2011, 35:9-20.
27. Morales-Jiménez J, Zúñiga G, Villa-Tanaca L, Hernández-Rodríguez C: Bacterial community and nitrogen fixation in the red turpentine beetle, Dendroctonus valens LeConte (Coleoptera: Curculionidae: Scolytinae). Microbial Ecology 2009, 58:879-891.
28. Hu Y, Zhang W, Liang H, Liu L, Peng G, Pan Y, Yang X, Zheng B, Gao GF, Zhu B, Hu H: Whole-genome sequence of a multidrug-resistant clinical isolate of Acinetobacter lwoffii. Journal of Bacteriology 2011, 193:5549-5550.
29. Parra G, Bradnam K, Korf I: CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 2007, 23:1061-1067.
30. Smith SG, Virkki N: Coleoptera. In Animal Cytogenetics. Volume 3. Berlin: Gebrüder Borntraeger; 1978:366.
31. Lanier GN, Wood DL: Controlled mating, karyology, morphology, and sex-ratio in the Dendroctonus ponderosae complex. Annals of the Entomological Society of America 1968, 61:517-526.
32. Lanier GN: Cytotaxonomy of Dendroctonus. In Application of Genetics and Cytology in Insect Systematics and Evolution, Proceedings of the 1980 Annual Meeting of the Entomological Society of America. Edited by: Stock MW. Moscow, Idaho, USA: Forest, Wildlife and Range Experimental Station, University of Idaho, Moscow; 1981:33-66.
33. Zúñiga G, Cisneros R, Hayes JL, Macias-Samano J: Karyology, geographic distribution, and origin of the genus Dendroctonus Erichson (Coleoptera: Scolytidae). Annals of the Entomological Society of America 2002, 95:267-275.
34. Wang S, Lorenzen MD, Beeman RW, Brown SJ: Analysis of repetitive DNA distribution patterns in the Tribolium castaneum genome. Genome Biology 2008, 9:R61.
35. Winder RS, Macey DE, Cortese J: Dominant bacteria associated with broods of mountain pine beetle, Dendroctonus ponderosae (Coleoptera: Curculionidae, Scolytinae). Journal of the Entomological Society of British Columbia 2010, 107:43-56.
36. Vasanthakumar A, Italo Delalibera J, Handelsman J, Klepzig KD, Schloss PD, Raffa KF: Characterization of gut-associated bacteria in larvae and adults of the southern pine beetle, Dendroctonus frontalis Zimmermann. Environmental Entomology 2006, 35:1710-1717.
37. Morales-Jiménez J, Zúñiga G, Ramírez-Saad HC, Hernández-Rodríguez C: Gut-associated bacteria throughout the life cycle of the bark beetle Dendroctonus rhizophagus Thomas and Bright (Curculionidae: Scolytinae) and their cellulolytic activities. Microbial Ecology 2012, 64:268-78.
38. Andersson MN, Grosse-Wilde E, Keeling CI, Bengtsson JM, Yuen MMS, Li M, Hillbur Y, Bohlmann J, Hansson BS, Schlyter F: Antennal transcriptome analysis of the chemosensory gene families in the tree killing bark beetles, Ips typographus and Dendroctonus ponderosae (Coleoptera: Curculionidae: Scolytinae). BMC Genomics 2013, 14:198.
39. Kelley ST, Farrell BD: Is specialization a dead-end?: The phylogeny of host use in Dendroctonus bark beetles. Evolution 1998, 52:1731-1743.
40. Sequeira AS, Farrell BD: Evolutionary origins of Gondwanan interactions: how old are Araucaria beetle herbivores? Biological Journal of the Linnean Society 2001, 74:459-474.
41. McKenna DD, Sequeira AS, Marvaldi AE, Farrell BD: Temporal lags and overlap in the diversification of weevils and flowering plants. Proceedings of the National Academy of Sciences of the United States of America 2009, 106:7083-7088.
42. Zhu B, Lou MM, Xie GL, Zhang GQ, Zhou XP, Li B, Jin GL: Horizontal gene transfer in silkworm, Bombyx mori. BMC Genomics 2011, 12:248.
43. Thompson J, Robrish SA, Immel S, Lichtenthaler FW, Hall BG, Pikis A: Metabolism of sucrose and its five linkage-isomeric alpha-D-glucosyl-D-fructoses by Klebsiella pneumoniae. Participation and properties of sucrose-6-phosphate hydrolase and phospho-alpha-glucosidase. Journal of Biological Chemistry 2001, 276:37415-37425.
44. Acuña R, Padilla BE, Flórez-Ramos CP, Rubio JD, Herrera JC, Benavides P, Lee SJ, Yeats TH, Egan AN, Doyle JJ, Rose JK: Adaptive horizontal transfer of a bacterial gene to an invasive insect pest of coffee. Proceedings of the National Academy of Sciences of the United States of America 2012, doi:10.1073/pnas.1121190109.
45. Choi JH, Kijimoto T, Snell-Rood E, Tae H, Yang Y, Moczek AP, Andrews J: Gene discovery in the horned beetle Onthophagus taurus. BMC Genomics 2010, 11:703.
46. Cornman SR, Schatz MC, Johnston SJ, Chen YP, Pettis J, Hunt G, Bourgeois L, Elsik C, Anderson D, Grozinger CM, Evans JD: Genomic survey of the ectoparasitic mite Varroa destructor, a major pest of the honey bee Apis mellifera. BMC Genomics 2010, 11:602.
47. Gompert Z, Forister ML, Fordyce JA, Nice CC, Williamson RJ, Buerkle CA: Bayesian analysis of molecular variance in pyrosequences quantifies population genetic structure across the genome of Lycaeides butterflies. Molecular Ecology 2010, 19:2455-2473.
48. Boone CK, Aukema BH, Bohlmann J, Carroll AL, Raffa KF: Efficacy of tree defense physiology varies with bark beetle population density: a basis for positive feedback in eruptive species. Canadian Journal of Forestry Research 2011, 41:1174-1188.
49. Goodsman DW, Erbilgin N, Lieffers VJ: The impact of phloem nutrients on overwintering mountain pine beetles and their fungal symbionts. Environmental Entomology 2012, 41:478-486.
50. Pauchet Y, Wilkinson P, Chauhan R, Ffrench-Constant RH: Diversity of beetle genes encoding novel plant cell wall degrading enzymes. PLoS ONE 2010, 5:e15635.
51. Watanabe H, Tokuda G: Cellulolytic systems in insects. Annual Review of Entomology 2010, 55:609-632.
52. Feyereisen R: Evolution of insect P450. Biochemical Society Transactions 2006, 34:1252-1255.
53. Feyereisen R: Arthropod CYPomes illustrate the tempo and mode in P450 evolution. Biochimica et Biophysica Acta 2011, 1814:19-28.
54. Iga M, Kataoka H: Recent studies on insect hormone metabolic pathways mediated by cytochrome P450 enzymes. Biological & Pharmaceutical Bulletin 2012, 35:838-843.
55. Sztal T, Chung H, Berger S, Currie PD, Batterham P, Daborn PJ: A cytochrome p450 conserved in insects is involved in cuticle formation. PLoS ONE 2012, 7:e36544.
56. Sathyanarayanan S, Zheng X, Kumar S, Chen CH, Chen D, Hay B, Sehgal A: Identification of novel genes involved in light-dependent CRY degradation through a genome-wide RNAi screen. Genes & Development 2008, 22:1522-1533.
57. Sandstrom P, Welch WH, Blomquist GJ, Tittiger C: Functional expression of a bark beetle cytochrome P450 that hydroxylates myrcene to ipsdienol. Insect Biochemistry and Molecular Biology 2006, 36:835-845.
58. Che-Mendoza A, Penilla RP, Rodríguez DA: Insecticide resistance and glutathione S-transferases in mosquitoes: A review. African Journal of Biotechnology 2009, 8:1386-1397.
59. Habig WH, Pabst MJ, Jakoby WB: Glutathione S-transferases. The first enzymatic step in mercapturic acid formation. Journal of Biological Chemistry 1974, 249:7130-7139.
60. Atkins WM, Wang RW, Bird AW, Newton DJ, Lu AY: The catalytic mechanism of glutathione S-transferase (GST). Spectroscopic determination of the pKa of Tyr-9 in rat alpha 1-1 GST. Journal of Biological Chemistry 1993, 268:19188-19191.
61. Sheehan D, Meade G, Foley VM, Dowd CA: Structure, function and evolution of glutathione transferases: implications for classification of non-mammalian members of an ancient enzyme superfamily. The Biochemical Journal 2001, 360:1-16.
62. Enayati AA, Ranson H, Hemingway J: Insect glutathione transferases and insecticide resistance. Insect Molecular Biology 2005, 14:3-8.
63. Friedman R: Genomic organization of the glutathione S-transferase family in insects. Molecular Phylogenetics and Evolution 2011, 61:924-932.
64. Oppert C, Klingeman WE, Willis JD, Oppert B, Jurat-Fuentes JL: Prospecting for cellulolytic activity in insect digestive fluids. Comparative Biochemistry and Physiology Part B, Biochemistry & Molecular Biology 2010, 155:145-154.
65. Robinson GE, Hackett KJ, Purcell-Miramontes M, Brown SJ, Evans JD, Goldsmith MR, Lawson D, Okamuro J, Robertson HM, Schneider DJ: Creating a buzz about insect genomes. Science 2011, 331:1386.
66. Davis CS, Mock KE, Bentz BJ, Bromilow SM, Bartell NV, Murray BW, Roe AD, Cooke JEK: Isolation and characterization of 16 microsatellite loci in the mountain pine beetle, Dendroctonus ponderosae Hopkins (Coleoptera: Curculionidae: Scolytinae). Molecular Ecology Resources 2009, 9:1071-1073.
67. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I: ABySS: a parallel assembler for short read sequence data. Genome Research 2009, 19:1117-1123.
68. Canada’s Michael Smith Genome Sciences Centre: Anchor: Post-processing tools for de novo assemblies. [http://www.bcgsc.ca/platform/bioinfo/software/anchor].
69. Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, Holt C, Sanchez Alvarado A, Yandell M: MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research 2008, 18:188-196.
70. Institute for Systems Biology: RepeatMasker. [http://www.repeatmasker.org/].
71. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J: Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and Genome Research 2005, 110:462-467.
72. Stanke M, Tzvetkova A, Morgenstern B: AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biology 2006, 7 Suppl 1:S11 11-18.
73. Korf I: Gene finding in novel genomes. BMC Bioinformatics 2004, 5:59.
74. Lukashin AV, Borodovsky M: GeneMark.hmm: new solutions for gene finding. Nucleic Acids Research 1998, 26:1107-1115.
75. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997, 25:3389-3402.
76. Slater GS, Birney E: Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 2005, 6:31.
77. Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, et al: InterPro: the integrative protein signature database. Nucleic Acids Research 2009, 37:D211-215.
78. Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25:1754-1760.
79. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25:2078-2079.
80. The R Project for Statistical Computing. [http://www.r-project.org].
81. K-State Bioinformatics Center: BeetleBase: Tribolium castaneum. [http://beetlebase.org].
82. Price AL, Jones NC, Pevzner PA: De novo identification of repeat families in large genomes. Bioinformatics 2005, 21 Suppl 1:i351-358.
83. Chen F, Mackey AJ, Stoeckert CJ Jr, Roos DS: OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Research 2006, 34:D363-368.
84. Li L, Stoeckert CJ Jr, Roos DS: OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Research 2003, 13:2178-2189.
85. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 2004, 32:1792-1797.
86. Price MN, Dehal PS, Arkin AP: FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS ONE 2010, 5:e9490.
87. Letunic I, Bork P: Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Research 2011, 39:W475-478.

doi:10.1186/gb-2013-14-3-r27
Cite this article as: Keeling et al.: Draft genome of the mountain pine beetle, Dendroctonus ponderosae Hopkins, a major forest pest. Genome Biology 2013 14:R27.

