UBC Faculty Research and Publications

Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria
Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F
Sep 27, 2007

BioMed Central
BMC Biology
Open Access
Research article

Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

Christopher J Jackson1, John E Norman2, Murray N Schnare2, Michael W Gray2, Patrick J Keeling3 and Ross F Waller*1

Address: 1School of Botany, the University of Melbourne, Victoria 3010, Australia, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 1X5, Canada and 3Department of Botany, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada

Email: Christopher J Jackson - c.jackson4@pgrad.unimelb.edu.au; John E Norman - John.Norman@gowlings.com; Murray N Schnare - mschnare@rsu.biochem.dal.ca; Michael W Gray - m.w.gray@dal.ca; Patrick J Keeling - pkeeling@interchange.ubc.ca; Ross F Waller* - r.waller@unimelb.edu.au

* Corresponding author

Abstract

Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes.

Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs.
Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression.

Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA editing, loss of stop codons, and use of trans-splicing.

Published: 27 September 2007
BMC Biology 2007, 5:41 doi:10.1186/1741-7007-5-41
Received: 17 April 2007
Accepted: 27 September 2007
This article is available from: http://www.biomedcentral.com/1741-7007/5/41
© 2007 Jackson et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The origin of mitochondria by endosymbiosis has emerged as a pivotal event in the evolution of eukaryotes. All eukaryote groups that have been studied bear a derivative of this endosymbiont, and for most the resulting mitochondrion is central to energy metabolism as well as providing several other anabolic and catabolic functions [1].
A relict, though functionally essential, mitochondrial genome (or mtDNA) persists in all but a few anaerobic eukaryotes, and the genes in these genomes firmly identify the original endosymbiont as an α-proteobacterium [2]. The jakobid flagellate Reclinomonas americana has the least derived mitochondrial genome characterized to date, with at least 97 genes encoded on a single, circular-mapping 69 kb chromosome [3]. More typically mitochondrial genomes have been reduced to 40–50 genes arranged on either circular- or linear-mapping chromosomes of 15–60 kb (although many plant mitochondrial genomes have been secondarily expanded to several hundreds to thousands of kb) [4].

In some eukaryotic groups, however, the mtDNA has been modified more substantially, resulting in extremes in genome structure. For example, trypanosomatid mtDNA consists of a few dozen large circular molecules and several thousand minicircles that encode guide RNAs that participate in extensive U insertion/deletion RNA editing [5]. Diplonemid mitochondria also contain multiple circular mtDNA molecules, each encoding gene fragments that are trans-spliced to generate functional transcripts [6]. Another example is the mtDNA in the ichthyosporean, Amoebidium parasiticum: in this case, mitochondrial genes are fragmented and dispersed over several hundred linear chromosomes, totaling > 200 kb [7]. Over the diversity of eukaryotes, mitochondrial genomes exhibit other interesting characteristics, including the use of a number of different non-standard genetic codes, many of which involve alterations in start and, more rarely, stop codons [8,9].

One large group in which particularly interesting mitochondrial genome variation has been found is alveolates. Three major phyla make up alveolates: ciliates, apicomplexans, and dinoflagellates, with apicomplexans and dinoflagellates being sister clades to the exclusion of ciliates [10,11].
Within alveolates, ciliate mtDNA is the most conventional, consisting of a linear molecule, 40–50 kb in length, that codes for many of the standard mitochondrial proteins found in other organisms [12]. By contrast, the mtDNA of the apicomplexan genus Plasmodium is the smallest known, consisting of a linear, 6 kb tandem repeat [13] with only three protein-coding genes: cytochrome oxidase subunit 1 (cox1), cytochrome oxidase subunit 3 (cox3) and cytochrome b (cob). In addition, ciliate mtDNA encodes two ribosomal RNAs (rRNAs), but the corresponding apicomplexan genes are fragmented to an unprecedented degree and scattered about the genome [13,14].

To date, dinoflagellate mtDNAs have been the least well studied of alveolate mitochondrial genomes, with existing data pointing to a genome exhibiting several eccentricities. The first sequences isolated were four copies of cox1 from Crypthecodinium cohnii, each of which was found to occur in a unique genomic context [15]. Southern blots demonstrated multiple different copies of this gene that varied in abundance, suggesting the C. cohnii mitochondrial genome is not as streamlined as in apicomplexans. Subsequently, cob and cox3 have been found as well, and multiple, sometimes fragmented copies of these genes have now been reported from diverse dinoflagellates (Gonyaulax polyedra, Pfiesteria piscicida, Alexandrium catenella) [16-18]. Most unexpected, however, was the demonstration that protein-coding transcripts are heavily edited at the RNA level in diverse dinoflagellates [18,19], unlike the case in either apicomplexans or ciliates.

To gain greater insight into the nature of dinoflagellate mitochondrial genomes, we have generated a large body of mitochondrial genomic and transcriptional data for two distantly related dinoflagellate species, C. cohnii and Karlodinium micrum. These data encompass more than 30 mtDNA fragments totaling > 42 kb, and more than 50 mitochondrial transcripts. This new information highlights several novel features of the organization and expression of the dinoflagellate mitochondrial genome, and concurrent studies in two additional distantly related dinoflagellates, Amphidinium carterae [20] and Oxyrrhis marina [21], corroborate a number of our findings. Together, these data reinforce the conclusion that the dinoflagellate mitochondrial genome has been substantially reorganized since the divergence of dinoflagellates and apicomplexans from a common ancestor.

Results

Genomic sequence reveals a complex mitochondrial genome

Crypthecodinium cohnii

Previously reported C. cohnii cox1 sequences indicated multiple copies of the gene with different flanking sequences [15]. To test if this genomic complexity extends to other C. cohnii mitochondrial genes, we sequenced multiple genomic clones containing cob and/or cox3. A library of EcoRI restriction fragments constructed from a fraction enriched in mtDNA was screened using a C. cohnii cob gene probe, obtained by PCR. This screen recovered a cob clone linked to a 57-bp cox3 fragment, which itself was used to probe for cox3-containing clones. In total, 14 clones were characterized (11 cob, two cox3 and one containing both), ranging in size from 2.5 kb to 5.4 kb (eight clones were 3.7 kb long). End sequencing and restriction mapping identified six unique cob-containing clones, and three unique cox3-containing clones.
Four clones were completely sequenced (Figure 1).

The largest clone, pc3#2.2 (5.4 kb), contains a complete or nearly complete cob gene (see below), followed by three other identifiable sequences: a 49-bp stretch identical to a sequence previously found in a cox1-containing clone [15]; a 113-bp cox3 segment; and a 99-bp large subunit (LSU) rRNA sequence corresponding to mitochondrial LSUG in apicomplexans [14]. Two additional cob clones were sequenced, pcb#7 (3.7 kb) and pcb#2 (3.2 kb). Both encode cob, but with different flanking sequences than in pc3#2.2. pcb#2 contains unique 3' sequence immediately after the cob repeat, whereas pcb#7 contains additional common sequence with pc3#2.2 for ~1 kb before unique sequence occurs (Figure 1). Amongst these clones, we observed two different 5'-flanking sequences and three different 3'-flanking sequences (Figure 1). This arrangement recapitulates the organization of cox1 in C. cohnii mtDNA [15], i.e., a central repeat (1072 bp, containing most of the cob ORF) flanked by different arrays of unique upstream and downstream sequences. Partial sequencing of the remaining clones revealed an additional unique 5'-flanking sequence (in pcb#8) and one additional unique 3'-flanking sequence (in pcb#4 and pcb#9) in the immediate vicinity of the cob ORF (data not shown).

Of the three cob-containing clones described above, only pcb#2 encodes a complete cytochrome b (Cob) protein (see below). pc3#2.2 and pcb#7 share an alternative 3' sequence that predicts a Cob C-terminal sequence lacking 24 amino acid residues compared with the pcb#2-predicted Cob as well as the corresponding Plasmodium falciparum Cob. This suggests that the pc3#2.2 and pcb#7 Cob ORFs represent pseudogenes. Variable 3' coding sequences were also seen previously for C. cohnii cox1, with some coding sequences also truncated compared to other dinoflagellate sequences [15].

Figure 1. Schematic of C. cohnii mtDNA fragments.
Mitochondrial sequences are drawn to scale, with coding sequence on either the forward or reverse strand indicated above or below the line, respectively. Colored blocks indicate protein-coding genes and hatched boxes denote rRNA genes. Coding sequence is identified by sequence similarity to gene homologues irrespective of standard start and stop codons. Common sequence between fragments (> 99% identity) is indicated by horizontal dashed lines and matching lowercase letters. Black boxes indicate locations and sizes of Southern blot probes. Large inverted repeats (> 9) are indicated by black dot pairs above and below each sequence, and short proximal inverted repeats (> 6) are indicated by paired vertical dashes. Minor differences of inverted repeat distribution between common sequence (dashed lines) are due to the minor sequence differences.

One cox3-containing clone (pc3#5) was also sequenced, but it was found not to encode an intact cox3 gene. Instead, this clone encoded 1339 bp identical in sequence to the portion of pc3#2.2 that included the 113-bp cox3 segment and the 49-bp cox1 sequence (Figure 1). This clone was also flanked by unique sequences, providing further evidence that mitochondrial genes occur in multiple genomic contexts in C. cohnii.

To further investigate the arrangements and relative numbers of mtDNA elements, Southern hybridization analysis was performed using region-specific probes. As shown in Figure 1, probes were generated specific to: the cob coding sequence ('cob'); two cob 3'-flanking regions ('cb1', specific to pc3#2.2 and pcb#7; and 'cb3', specific to pcb#2); the cox3 sequence ('cox3'); and the rRNA sequence LSUG ('rnl'). These probes were hybridized against a mtDNA-enriched fraction hydrolyzed by EcoRI. With the 'cob' probe, a strong signal was detected at 3.7 kb and weaker signals at 4.8, 4.5, 3.5, and 3.0 kb (Figure 2).
This result is consistent with dominant EcoRI clones being 3.7 kb, and with multiple genomic contexts for cob. Probing with 3' flanking sequence 'cb1' revealed a similar banding pattern to that generated by the 'cob' probe, indicating that this region is typically contiguous with the cob coding sequence. Probing with 'cb3' presented a very different profile, with 10 bands ranging in size from 3.7 to 0.5 kb and of varying intensity (Figure 2). The cb3 sequence evidently occurs in numerous EcoRI fragments, some without cob. Probing with 'cox3' and 'rnl' also revealed multiple bands with varying intensity (Figure 2), again indicating that these mtDNA elements are present in several different genomic arrangements. Together these Southern data verify the existence of multiple copies of C. cohnii mtDNA elements occurring in different contexts, and indicate that up to 10 different arrangements occur for some of these elements.

Karlodinium micrum

Putative mitochondrial genes were identified from a survey of 16544 K. micrum expressed sequence tag (EST) sequences assembled into 11903 unique clusters [22]. Oligoadenylation of mitochondrial gene transcripts is known from other organisms [23,24], and this also appears to be the case in dinoflagellates as the poly(A)-dependent K. micrum survey also contained many cDNAs for mitochondrial genes. Mitochondrial sequences were identified by homology to genes in other systems, and all such cDNAs were fully sequenced. Using this strategy we identified sequences representing the three protein-encoding genes found in C. cohnii: cox1 (1 cDNA), cob (11 cDNAs) and cox3 (9 cDNAs). The average A+T content of these sequences was 69% (compared to 49% for nuclear genes, calculated from all 11903 K. micrum clusters), consistent with their being encoded in the mitochondrion. We found no other mitochondrial protein-coding sequences exhibiting the strong A+T biases suggestive of an origin from mtDNA (cox2 coding sequence, for example, which is typically encoded in mitochondria but is known to have been transferred to the nucleus in dinoflagellates [25], contains 47% A+T). Several short cDNA sequences, however, with high similarity to the fragmented apicomplexan mitochondrial rRNAs [14] (see also GenBank acc. no. M76611 for updated annotation) were identified. These correspond to apicomplexan LSU rRNA fragments LSUA, RNA2, LSUE, LSUG and RNA10 (3, 1, 3, 1, and 9 cDNAs, respectively), small subunit (SSU) rRNA fragment RNA8 (9 cDNAs), and an RNA (RNA7, 7 cDNAs) that has yet to be assigned to either the LSU or SSU rRNA. While these sequences have a lesser A+T bias (56%) compared with the mitochondrial protein-encoding sequences, the high similarity of these sequences to their apicomplexan counterparts (see below), and known oligoadenylation of these transcripts in apicomplexans [23,24], strongly implicates these sequences as additional elements of the K. micrum mtDNA.

Figure 2. Southern blot analysis of C. cohnii mtDNA with 32P-labelled probes specific for mitochondrial gene and flanking regions. A fraction enriched in mtDNA was either untreated ('U') or EcoRI hydrolysed ('E') and the products separated by gel electrophoresis. Blots were hybridized with probes specific for cob, cob-flanking sequences ('cb1' and 'cb3'), cox3 or LSUG ('rnl') (see Figure 1 for probe locations). Size markers are indicated to the left in kb pairs.

With these 10 mtDNA tags, we used PCR to generate genomic sequences corresponding to each gene and regions linking them, with the aim of assembling large portions of K. micrum mtDNA sequence. Intergenic sequence recovered by this approach was used to provide further priming sites to extend the sampling of K. micrum mtDNA.
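The A+T comparison reported above for the K. micrum cDNAs (69% for the mitochondrial protein-coding sequences against 49% for nuclear clusters) can be illustrated with a minimal sketch. The function names, the 0.60 cutoff and the toy sequences below are assumptions made for illustration; they are not the authors' pipeline or data.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T/U among unambiguous bases (A, C, G, T, U)."""
    seq = seq.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    if at + gc == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / (at + gc)


def is_at_rich(seq: str, cutoff: float = 0.60) -> bool:
    """Flag a sequence as A+T rich; the cutoff is a hypothetical value."""
    return at_content(seq) >= cutoff


# Toy sequences invented for illustration only (not real K. micrum data).
toy_at_rich = "ATTTAAATGCATTTATAATTGATTAAAT"
toy_balanced = "ATGCGCATCGGATCCGATGCATGCCGTA"

for name, seq in [("toy_at_rich", toy_at_rich), ("toy_balanced", toy_balanced)]:
    print(name, round(at_content(seq), 2), is_at_rich(seq))
```

A single cutoff of this kind would miss the rRNA fragments, whose A+T content (56%) is closer to the nuclear average; the text assigns those to the mtDNA on the basis of sequence similarity and oligoadenylation, not base composition.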
In addition to amplification of individual genes, a total of 20 distinct gene linkage products were generated and fully sequenced (Figure 3B). This analysis yielded a sequence in which mitochondrial genes were linked to one another in many different contexts. Gene fragments were also common, as were mtDNAs with three or four distinct fragments or tandem repeats (Figure 3B). In total, cob sequences were found in at least six mutually exclusive linkages, cox3 in five, cox1 in four, LSUE in nine, RNA10 in six, RNA2 in five and RNA7 in one. Additionally, two large cDNAs (GenBank accession EF443051, 5854 bp; and EF443052, 2153 bp) provided further evidence of multiple copies of mitochondrial genes and gene fragments linked in novel arrangements. EF443051, for example, contains the LSUG coding sequence, a second partial LSUG unit within a 170-bp repeat, the LSUA sequence, the RNA8 sequence, and an internal fragment of the cox1 gene (73 bp). These cDNAs also indicate that polycistronic transcription occurs in dinoflagellate mitochondria.

Intergenic sequences from the PCR clones were examined for additional coding elements by comparison to publicly available databases, specifically searching against K. micrum ESTs as well as comparing the intergenic regions to one another. No identifiable genes were found, but one cDNA sequence (GenBank accession EF443049) was represented in one mtDNA clone, implicating this sequence as an additional transcriptional unit of the mitochondrial genome (Figure 3B, xvi). Comparison of intergenic sequences to one another revealed numerous dispersed repeated sequences with either 100% or very high degrees of identity (Figure 3B, dashed lines). Overall, data from K. micrum are consistent with those from C. cohnii, both pointing to a complex genome organization evidently underpinned by a high level of recombination within dinoflagellate mitochondria.

Inverted repeats in mtDNA

Previous analysis of C. cohnii cox1 identified many short inverted repeats in flanking, non-coding sequences [15]. We have applied a similar analysis to the C. cohnii cob- and cox3-containing sequences, as well as the K. micrum mtDNA data, and find a very similar pattern of repeat features, although we also note some differences between the two taxa. Within the C. cohnii sequences, we screened for inverted repeats of different length and distance between them, and found two distinct but prevalent classes of this element type. The first class is similar to those previously described [15], and consists of very closely spaced, small inverted repeats (> 6 nucleotides and no more than 5 nucleotides apart). These inverted repeats occur almost exclusively within non-coding sequence, with the only exceptions being at the very extremities of genes (Figure 1, vertical dashes). A second class of inverted repeats consists of longer repeat elements (> 9 nucleotides) no more than 50 nucleotides apart. Such inverted repeats are also prevalent in C. cohnii mtDNA, and are almost exclusively features of the non-coding sequences (Figure 1, small circles).

Analysis of K. micrum mtDNA showed that inverted repeats are also a feature of intergenic sequences; however, in this case only the larger class of inverted repeats was found, with none of the smaller, closely spaced inverted repeats occurring in any of the mtDNA sequences (Figure 3). Again these repeats are almost exclusively located within intergenic regions, with genic inverted repeats only occasionally present, within gene extremities. No equivalent inverted repeats were found in a random sample of 10 K. micrum nucleus-encoded gene sequences (10630 nucleotides total). The sequences of repeated elements in both C. cohnii and K. micrum are consistent with secondary structures such as stem loops and hairpins, and in both cases the repeated elements that could form such stem structures are typically G+C rich, in spite of the A+T bias of these organelle genomes. The inverted repeats described here are also distinct from secondary structural elements of the rRNAs (see below) that typically consist of imperfect inverted repeats. Densely packed inverted repeats, primarily in intergenic regions, were also recently described from A. carterae mtDNA [20]. In this case, imperfect inverted repeats were predicted to form stems of 50–150 nucleotides, with AT-rich loops of ~10–30 nucleotides. While inverted repeats therefore appear to be a consistent feature of dinoflagellate mitochondrial genomes, the elaboration of these elements is variable between taxa, with shorter repeats only present in C. cohnii.

Mitochondrial gene transcripts lack stop and start codons

Extensive substitutional RNA editing of transcripts occurs in dinoflagellate mitochondria, so exactly where an open reading frame begins and ends can only be tentatively inferred from genomic DNA. Accordingly we used K. micrum cDNAs, and publicly available mRNA sequences from several other dinoflagellates, to identify the ends of all three protein-coding genes.

Figure 3. Schematic of K. micrum mitochondrial cDNAs (A) and 20 mtDNA fragments generated by PCR (B). Gene sequences in (A) correspond to the longest cDNA data generated for each gene (see also Figure 4). Mitochondrial sequences are drawn to scale, with coding sequence in (B) on either the forward or reverse strand indicated above or below the line, respectively. Colored blocks indicate protein-coding genes, textured black boxes indicate rRNA genes.
cDNA lengths (in nucleotides (nt)) are indicated in (A), and corresponding nucleotide matches in PCR fragments are accordingly indicated in (B). Common intergenic sequences (> 99% identity) between PCR fragments are indicated by dashed lines and matching lowercase letters. The letter 'g' indicates matching sequence to unidentified cDNA EF443049. Inverted repeats are indicated by black dot pairs above and below each sequence.

Absence of stop codons

Oligoadenylation of transcripts apparently occurs upstream of any canonical stop codon in all protein-encoding transcripts analyzed, and for only one gene does oligoadenylation create an in-frame canonical stop codon. This lack of encoded stop codons applies to transcripts for cob, cox3 and cox1 represented from multiple species. All 11 cob transcripts from K. micrum are oligoadenylated at the same point, which corresponds to the expected C-terminus of Cob homologues (Figure 4), but does not include an in-frame stop. The 3' ends of transcripts from four other dinoflagellates (P. piscicida, Prorocentrum minimum, G. polyedra, A. carterae) are oligoadenylated at precisely the same position (Figure 4). For cox1, the mRNA sequences from four taxa (P. minimum, P. piscicida, A. carterae, and Karenia brevis) are all oligoadenylated at the same position, where the protein sequence is predicted to terminate (Figure 4); once again, none of these encode a stop codon.

The K. micrum cox3 cDNAs present an even more interesting situation. Five of nine cDNAs are oligoadenylated approximately 40 codons upstream of the predicted C-terminus, and without an in-frame stop codon (Figure 4). However, another four cDNAs are oligoadenylated a further 129 nucleotides downstream; these cDNAs encode amino acid sequence with high similarity to the C-terminus of Cox3. In this case, oligoadenylation follows a U residue creating an in-frame UAA stop codon.
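How an oligo(A) tail can either complete a stop codon or simply read through as lysine codons can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard stop codons (UAA, UAG, UGA), a hypothetical function name, and short toy sequences that are not real cox3 data.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard-code stops, assumed here


def stop_from_oligoadenylation(body: str, tail_len: int = 6, frame: int = 0) -> bool:
    """Return True if the first in-frame stop codon of body + oligo(A)
    uses at least one base contributed by the oligo(A) tail."""
    body = body.upper().replace("T", "U")
    rna = body + "A" * tail_len
    for i in range(frame, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            # The codon spans positions i..i+2; it borrows tail bases
            # only if it extends past the end of the encoded body.
            return i + 3 > len(body)
    return False


# Toy transcripts invented for illustration only.
print(stop_from_oligoadenylation("UUCUCAU"))   # body ends ...UCA U, tail supplies AA
print(stop_from_oligoadenylation("UUCUCA"))    # tail read as AAA codons, no stop
print(stop_from_oligoadenylation("UUCUAAGG"))  # stop is already encoded in the body
```

In this sketch only the first toy transcript, which ends with a U in the first codon position, acquires a UAA from the tail, matching the situation described for the longer cox3 cDNAs. The second case corresponds to transcripts whose tail would be conceptually translated as a run of lysines.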
The generation of an in-frame stop codon concomitant with oligoadenylation is also apparent in Amphidinium cox3 mRNA; however, as in K. micrum, other cox3 Amphidinium transcripts are oligoadenylated prematurely, within a few bases of the premature oligoadenylation site in K. micrum cDNAs (Figure 4). Alternative oligoadenylation sites have also been reported for cox3 transcripts in the dinoflagellate G. polyedra [16].

A potential alternative stop codon was sought among these transcript data by looking for a codon that occurs exclusively in the 3' region of these coding sequences. However, no such candidate codon could be identified either within or between the taxa surveyed, nor is there any evidence for use of a non-standard genetic code (with the possible exception of start codons, see below). Moreover, oligoadenylation consistently occurred at the position where the protein sequence is expected to terminate, leaving little or no apparent untranslated region (UTR).

Alternative start codons

Dependence on a standard ATG start codon also is apparently relaxed in dinoflagellate mitochondria. From multiple dinoflagellate species mRNAs for the three protein-coding genes extend beyond conserved N-termini, suggesting these transcripts are likely to be full length, but all lack a plausible N-terminal AUG (Figure 4). Existing genomic sequences corroborate the lack of initiating ATGs.

Transcript data for cox3 from three species (K. brevis, K. micrum and G. polyedra) and cox1 from K. micrum are all apparently full length based on protein alignments and all lack an AUG in the terminal region (Figure 4). The corresponding genomic region upstream of K. micrum cox1 does not contain an in-frame ATG until 615 nucleotides upstream of the conserved sequence, and 11 stop codons fall between them, supporting the likely absence of an ATG from this gene. Genomic sequences for C. cohnii cox1, however, do contain an in-frame ATG ~13 codons upstream of N-terminal sequence conservation seen among dinoflagellates. While it is possible that this particular ATG serves as the initiator codon in this taxon, the lack of any sequence conservation with the corresponding K. micrum sequence within this 13-residue stretch (Figure 4) suggests that this might also represent a chance ATG within the 5' UTR.

K. micrum cob mRNAs do encode an AUG close to the site where sequence conservation with other Cob proteins begins, but on close inspection there is conserved sequence upstream of this codon (Figure 4). Further, cob from the early-diverging member of the dinoflagellates, Oxyrrhis marina, lacks this AUG or any other upstream of this region [21]. In mRNAs of all other available species (K. micrum, K. brevis, and P. piscicida) there is strong conservation of the four predicted amino acid residues upstream of this ATG (F, V/L, L, L), further suggesting that translation likely initiates upstream of it (Figure 4). The conservative change of this second residue, V to L, among dinoflagellate taxa (and V to I in the genomic sequence for C. cohnii) supports the inference that this region likely represents protein-coding sequence rather than UTR. Some conservation of this sequence with Plasmodium Cob is also apparent (Figure 4). None of the four apparently full-length K. micrum cob genomic sequences encodes an additional ATG codon between this region of conservation and the next in-frame stop codon (Figure 4), and the same situation is seen in a P. piscicida cob sequence. The C. cohnii genomic sequences are the only cases to date where potential ATG codons do occur in this upstream sequence (Figure 4). However, two of these occur well upstream of any 5'-sequence conservation among dinoflagellates, and would represent unusually long (5'-extended) and divergent Cob proteins in these cases (Figure 4).

Figure 4. Absence of conventional stop and start codons represented in protein alignments of dinoflagellate Cob, Cox3 and Cox1. Predicted amino acid sequence termini represent (A) 3' and (B) 5' sequences from cDNA and gDNA. Blue sequence indicates conceptual translation of 3' oligo(A) tract of mRNAs. Identical and similar residues are indicated by black or grey backgrounds, respectively. Inferred differences between cDNA and gDNA sequences of the same taxa correspond to RNA editing changes. Only longer cox3 mRNAs (mRNAL) encode an in-frame stop codon, generated by oligoadenylation following a terminal U. The 5' sequence termini represent either the limit of reverse transcription of mRNAs, or inferred translations of 5' genomic coding sequence (gDNA). Cob 3' sequence 'C. coh gDNA1' corresponds to clone pcb#2, while 'C. coh gDNA2' corresponds to clones pc3#2.2 and pcb#7. Underlined K in 'K. mic mRNA' (B, Cox1 5') indicates the site of a 10-nt deletion relative to 'K. mic gDNA'. Underlined Ms (B, Cob 5' and Cox1 5') indicate possible initiation codons found in-frame, but upstream of conserved sequence. Non-dinoflagellate homologues included for comparison of protein termini are: P. fal, Plasmodium falciparum M76611; C. mer, Cyanidioschyzon merolae, BAA34657; R. ame, Reclinomonas americana, AAD11871; N. oli, Nephroselmis olivacea, AAF03208; H. sap, Homo sapiens, AAZ02899. Dinoflagellate taxa and accession numbers: K. mic, Karlodinium micrum, this study; C. coh, Crypthecodinium cohnii, this study; L. pol, Lingulodinium polyedrum, CD810189, CD810189; G. pol, Gonyaulax polyedra, AF142470; P. pis, Pfiesteria piscicida, AF357518, AF463413, AF357518, AF357521; K. bre, Karenia brevis, CO062170, CO065693, CO062289, CO060561; A. car, Amphidinium carterae, CF064846, CF065669, CF064811, CF067165; P. min, Prorocentrum minimum, AY030285, AF463415.

A

Cob 3'
L. pol mRNA   -FFLSFLSFLWIGAQFPVEKFLSYARILTLHYYFLLM--CILFSKKKKKn
P. pis mRNA   -LFSLSLSFLWIGYQFPQEKFLSYARILTLYYYFLLM--CILFSKKKKKn
P. min mRNA   -FFLLVLSFLWIGAQFPQEKFLSYARILTLYYYFLLM--CILFRKKKKKn
A. car mRNA   -FSLSFLSLIYIGGQIPHSTFISYIRLLTINYYFLII--SILILKKKKKn
K. mic mRNA   -FFLSFLSCLWIGAQFPQEKFLSYARILTLDFYFLLI--CISFSKKKKKn
K. mic gDNA   -FFLSLLSCLWIGAQFPQEKFLSYGRILTLDFYFLLI--CISFSFYLLFLYAVAHPVNGSSKGFRFIIS.
C. coh gDNA1  -FFSIYICFIWIGAQLPQEMFISYGRILTLHYYFLIILYLLPLEISVCCCQRIIG.
C. coh gDNA2  -FFSIYICFIWIGAQLPQEMFISLSKSYKQW.
P. fal gDNA   -FMCAFYALLWIGCQLPQDIFILYGRLFIVLFFCSGLFVLVHYRRTHYDYSSQANI.

Cox3 3'
A. car mRNAS  -LHFFHLIIGLLLLSLLFWSCNYLSNRKKKKKn
A. car mRNAL  -LHFFHLIVGLLLLSLLFWGCSYLSNLDKYVCFRSSEVHLFFACSL-----FYWHFVEVLWLFILLGIYFN.KKKKn
K. mic mRNAS  -LHFFHLVVGLFLLSLFFWGCCFPTKKKKKn
K. mic mRNAL  -LHFFHLVVGLFLLSLFFWGCCFPTKIVWFLNLRVSEVHLFYNLQN-----FYWHFLEILWLFIFLFLYSL.KKKKn
P. fal gDNAL  -LHFSHVVIGLLLLIIYFIRIIEIYDTSTEWFINSFGISYIVIPHTDQITILYWHFVEIVWLYIEFLFYSE.

Cox1 3'
P. pis mRNA   -LTFVGILLTFSPMHFLGFNVMPRRIPDFPDSFHSWNFLSSIGSGITLLSFGFLKKKKKn
P. min mRNA   -LTFVGILLTFSPMHFLGFNVLPRRIPDFPDSFHSWNFLSSIGSGITLLSFAILKKKKKn
K. bre mRNA   -LTLVGILLTFSPMHFLGFNVMPRRIPDFPDSFHSWNFLSSIGSGITLLSFAMLKKKKKn
A. car mRNA   -LVFIGIILTFIPIHFLGFNLMPRRIQDFPDSFHSWNFLSSIGSGITLLSFTMLKKKKKn
K. mic gDNA   -STFIGILLTFSPMHFLGFNVMPRRIPDFPDTFHSWNFLSSIGSGITFLSFGMLTGNPDDIFTAAVRRLVLR.
P. fal gDNA   -LFFVGVILTFLPMHFLGFNVMPRRIPDYPDALNGWNMICSIGSTMTLFGLLIFK.

B

Cob 5'
K. bre mRNA   ---------------------------------------------------------------HEDFLLLMKSHLQSYPCP
P. pis mRNA   ---------------------------------------------------LIPNFSFYCIYRITYFVLLMKSHLQSYPCP
K. mic mRNA   --------------------------------------------------------------LELHFVLLMKSHLQSYPCP
K. mic gDNA   FYIIFP.HSFYFYKTPEIPEFFYFVISLFSFCNLVTQHLISLLFLFNLNGSYNISLISSFLSLELYFVLLMESHLQSYPCP
P. pis gDNA   FQGLYFLKLINV.MKMNLQSNGSLNW.RQTTVDNDL.WIPDLIFYHICNCLLIPNSSFYCIYRITYFVLLMKSHLQSYPCP
C. coh gDNA   FSIYSYYLLVGQKSGHWFVGPTLGQCVAGHYVQHSFYLLGMKPKQFFYSLGHVAKCFTSGPVVQISFIFLMKSHLHTYPCP
P. fal gDNA   ----------------------------------------------------------FIVFMNFYSINLVKAHLINYPCP

Cox3 5'
K. bre mRNA   --------------TRLIFKTGICFSIHQEVASGPFCLLVNSPWLLVFALLFFQTALG-LNLYCWKGIHFSWSLDFIFLCL
K. mic mRNA   ----------------QLLYFGFSNSIHQEVASGPFCLLVNSPWLSVFALLFFLYVLG-LNLYCWNGIHFSWSLDFVFFCL
G. pol mRNA   HEPGERLCFLCFIEEISAWRLVFWNSIHLEVASGPFCFLLYSSWLIVFVLCVFEHYFSFINLYCWKGLHFSWNNFLIFIFI
C. mer gDNA   --------------------MSNLNSNLAIYNRHPFHLVDPSPWPFMASLSVLVFLFG-----LVSYLHGFKVGNFLFVFG
R. ame gDNA   -------------------------MSQTFVKKHPYHIVDQSPWPLLTSIGTLCSTFG-----GVMYFHSYPNGGFIAALG
N. oli gDNA   --------------------------MSSHAPQHPFHLVDPSPWPIFGSLAAFVTTSG-----GVMYMHSYSGGRIMFPLG
H. sap gDNA   ----------------------------MTHQSHAYHMVKPSPWPLTGALSALLMTSG-----LAMWFHFHS--MTLLVLG
P. fal gDNA   ------------------------FILFSNLSNIKAH-LVSYPALTSLYGTSL-KYFS-----------------VGILFT

Cox1 5'
P. pis mRNA   ---------------------------------------------------------FISLLKNCNHKRLGIYYLLSAFIFGIS
K. bre mRNA   -----------------------------------------------------QRIFFLSLVKNCNHKRLGIYYLLSAFIFGVS
K. mic mRNA   -------------------------------LISFSLSLSLLFYFLWNINKSSQRIFFLSLVKNCNHKRLGIYYLLSAFIFGVS
K. mic gDNA   ----LFPFPYHVCYFIFFGILIRFNKSSQRISFLSLVCYFIFFGILIRFNKSSQRISFLSLVKNCNHKGLGIYYLLSAFIFGIS
C. coh gDNA   ---MMELYSYYLWFVGLLAQHFIGLLAQQLLILSSDIIIVFWLAMGQTFVCWTLCPTFISSVKNCNHKGLGIYYLLSSFIFGIS
P. fal gDNA   ---------------------------------------------------IFIVLNRYSLITNCNHKTLGLYYLWFSFLFGSY

Trans-splicing of cox3

Included among the K. micrum cox3 cDNAs were four inferred to be full length (839 nucleotides) based on protein alignments (Figure 4), and five inferred to be prematurely oligoadenylated at nucleotide 712. Despite the fact that the longer cDNA is likely the functional cox3 mRNA, a genomic copy corresponding to it could not be amplified from genomic DNA using multiple primer combinations (all of which successfully amplified the corresponding fragments in RT-PCRs; data not shown). The longest product obtained from genomic DNA corresponded to nucleotides 50–712 of the full-length cox3 sequence. Six genomic fragments containing cox3 sequence were obtained by amplifying between genes, and these suggest that the gene is fragmented in the genome (Figure 3B, xv, xvi, xvii, xviii, xix and xx). Notably, three unique cox3 genomic sequences are truncated at nucleotide 712, precisely where the short cox3 transcripts are oligoadenylated (Figure 3B, xv, xvi and xx). Immediately downstream is a stop codon, and subsequently no further sequence similarity to cox3. Similarly, the only genomic sequences found to encode the 3' end of the long transcript are 5'-truncated at nucleotide 718, with sequence unrelated to cox3 upstream of this point (Figure 3B, xvii and xviii). Taken together, these data suggest that the long cox3 transcript is the product of trans-splicing, where nucleotides 1–712 are joined to nucleotides 718–839 arising from two different genomic fragments. The intervening five nucleotides (713–717) are all A residues in the full-length cox3 transcript, suggesting that trans-splicing occurs within the oligo(A) tail of the upstream transcript.

Mitochondrial rRNAs are fragmented in a similar pattern as in apicomplexans

SSU and LSU rRNAs are encoded in all characterized mtDNAs; however, until recently [17] no mitochondrial rRNA sequences had been described from dinoflagellates. In this study we have identified several discrete, short sequences with strong similarity to components of the highly fragmented rRNAs of apicomplexans [14] (GenBank acc. no. M76611).
From K. micrum, we obtained cDNA sequences representing five LSU rRNA fragments (LSUA, RNA2, LSUE, LSUG, and RNA10), one SSU rRNA fragment (RNA8), and one unassigned rRNA fragment (RNA7), all of which correspond to known transcriptional units of the Plasmodium mitochondrial genome. We also identified an additional LSU rRNA fragment, LSUF, as well as LSUE and LSUG, from an EST survey we previously conducted in Heterocapsa triquetra [26]. Alignment of LSUA, LSUE, LSUF, LSUG and RNA10 to their Plasmodium LSU homologues is shown in Figure 5. SSU rRNA fragment RNA8 and unassigned fragment RNA7 share 66% and 74% sequence identity with Plasmodium homologues, respectively. For each fragment, multiple cDNAs were sequenced (with the exception of RNA2 and LSUG), and oligoadenylation was found to occur at a consistent site (Figure 5). Although these cDNAs are all relatively short, the 5' ends could not be definitively determined from these cDNAs because the 5'-lengths were variable. Further, genomic copies (where they are known) encoded conserved sequence upstream of the 5' ends of cDNAs of LSUE and LSUG (Figure 5).

For C. cohnii, the LSUG sequence identified on EcoRI clone pc3#2.2 was analyzed by 3' RACE and the site of oligoadenylation was shown to be identical to that in the corresponding K. micrum and H. triquetra cDNAs (Figure 5). Northern analysis of C. cohnii RNA showed a single LSUG-positive band at ~108 nucleotides [27]. This size corresponds well with the limit of conservation among LSU rRNA sequences, as well as the size of the Plasmodium LSUG. C. cohnii LSUE was also amplified and the ends determined by 5'-cDNA sequencing and 3' RACE (Figure 5). Northern hybridization against mitochondrial RNA confirmed the presence of an ~200 nucleotide RNA species [27].

The oligoadenylation sites for mitochondrial rRNA fragments are identical among dinoflagellates, and either identical or within a few nucleotides of those observed in Plasmodium (Figure 5). The 5' ends of these sequences, whether defined experimentally (LSUE and LSUG from C. cohnii) or by sequence conservation, are also very similar to those of their Plasmodium counterparts. The only possible exception is K. micrum RNA2, where the sole cDNA obtained contained substantial upstream (305 nucleotides) and downstream (79 nucleotides) sequence compared to the region with similarity to Plasmodium RNA2. However, it is possible that this cDNA represents an unprocessed precursor, and accordingly further work is required to substantiate the size of this putative rRNA fragment. Secondary structure predictions for dinoflagellate sequences LSUA, LSUE, LSUF, LSUG, RNA10 and putative RNA2 (limited to the region of similarity to the Plasmodium RNA2) all indicate that the expected folding and intermolecular base pairings occur (Figure 6), and these fragments are likely to contribute to a viable reconstituted LSU rRNA, as for Plasmodium.

RNA editing

Protein-coding genes

RNA editing has been described for cox1, cob and cox3 transcripts from diverse dinoflagellates, including the cob mRNA of K. micrum [18-20]. Comparison of K. micrum cDNA and corresponding mtDNA sequences for the three genes identified here confirms this conclusion for transcripts of cob, and further shows that cox1 and cox3 transcripts are also edited. The average density of editing of the cox1 transcripts is one substitution per 36 nucleotides and this value is consistent with other studies in different species [18,19]. By contrast, editing in cox3 transcripts is over twice as dense, at one substitution per 17 nucleotides, making cox3 the most heavily edited gene transcript in dinoflagellates.
Editing of cob mRNA lies in between these extremes, at one substitution per 25 nucleotides.

In the case of cox1 transcripts, four types of substitutional changes were detected at 42 sites. Of these, 48% were A to G substitutions, followed by U to C (21%) and smaller proportions of C to U and G to C edits (17% and 14%, respectively). This observation is consistent with cox1 mRNA editing occurring in other species, where most (80%) of the reported changes are A to G and U to C substitutions [18,19]. So far, G to C changes have only been observed in mtDNA-encoded mRNAs of dinoflagellates, whereas A to G changes have elsewhere been reported only in nucleus-encoded mRNAs. cox3 mRNA editing types are generally consistent with those observed in cox1 and cob mRNAs. Five types of substitutional changes were observed at 50 sites, of which 42% were A to G changes, followed by C to U and U to C edits (28% and 22%, respectively), as well as three G to A edits (6%) and a single G to C edit (2%). For both cox1 and cox3 mRNAs, the majority of substitutions occur at the first or second positions of affected codons (88% and 96%, respectively), and over 90% of editing events result in a change in predicted amino acid. In K. micrum cox3 mRNA (and cox1 and cob mRNAs of other dinoflagellates [18,19]), editing also removes a UAG codon, which is typically a stop codon but is apparently unassigned in dinoflagellates.

Figure 5. Dinoflagellate LSU rRNA sequences aligned to those of their fragmented apicomplexan counterparts. Intact LSU rRNAs from the mitochondrion of a ciliate and plant and from a bacterium are included in the alignment. Color groups indicate distinct rRNA cDNAs with oligoadenylation shown in italics. K. micrum genomic sequence (gen) is included for LSUE, LSUG and RNA10 (lowercase sequence denotes primer sites used for RNA10 gen). Yellow highlights differences between K. micrum genomic and cDNA sequences. Red box indicates the conserved domain of the sarcin/ricin loop represented in RNA10 sequences. K.mic, Karlodinium micrum; H.tri, Heterocapsa triquetra; C.coh, Crypthecodinium cohnii; P.fal, Plasmodium falciparum M76611; A.tha, Arabidopsis thaliana, Y08501; T.pyr, Tetrahymena pyriformis, M58010; E.col, Escherichia coli, D12649.
Aligned fragments: LSUA, LSUE, LSUF, LSUG and RNA10.

Analysis of the 20 cDNAs corresponding to cox3 and cob offers further insight into the process of RNA editing in dinoflagellates. Despite overall uniformity of transcript editing, some cDNAs exhibit pre-edited states. K. micrum cox3 and cob contain 50 and 44 editing sites, respectively, with the cDNAs analyzed here representing in total 343 and 231 potential editing events, respectively. However, at nine of these sites in the cox3 cDNAs, and five in the cob cDNAs, the pre-edited nucleotide occurs, indicating 2.6% and 2.2% 'non-edits', respectively. These 'non-edits' were present in only a few cDNAs (two and three for cob and cox3, respectively), suggesting that the great majority of cDNAs represent mature transcripts. The pre-edited sites are scattered throughout the transcripts where they are found, occur between other edited sites, and in no obvious order in any sequence.
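
The tallies reported above (substitution types, editing density and the proportion of 'non-edits') follow from a position-by-position comparison of genomic and cDNA sequences. A minimal sketch in Python, assuming the two sequences are already aligned without gaps; the function names and the toy sequences are hypothetical and are not taken from the study's data or methods:

```python
from collections import Counter


def classify_edits(gdna: str, cdna: str) -> Counter:
    """Tally substitution types between a gap-free, pre-aligned gDNA/cDNA pair."""
    if len(gdna) != len(cdna):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Report everything in RNA letters (T is written as U).
    pre_seq = gdna.upper().replace("T", "U")
    post_seq = cdna.upper().replace("T", "U")
    edits = Counter()
    for pre, post in zip(pre_seq, post_seq):
        if pre != post:
            edits[f"{pre} to {post}"] += 1
    return edits


def nucleotides_per_edit(length: int, n_sites: int) -> float:
    """Editing density expressed as 'one substitution per N nucleotides'."""
    return length / n_sites


def non_edit_percent(pre_edited: int, potential_events: int) -> float:
    """Percentage of potential editing events left in the pre-edited state."""
    return 100.0 * pre_edited / potential_events


# Toy sequences (hypothetical): three substitutions in twelve positions.
toy = classify_edits("ATGACTGGACAT", "GTGACCGGGCAT")
print(dict(toy), nucleotides_per_edit(12, sum(toy.values())))
```

With the counts given in the text, `non_edit_percent(9, 343)` and `non_edit_percent(5, 231)` round to 2.6 and 2.2, and `nucleotides_per_edit(839, 50)` rounds to 17, matching the reported cox3 and cob values.
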
These pre-edited sites may indicate editing failures, in which case such transcripts could give rise to defective translation products. Alternatively, they may represent editing intermediates. If the latter is the case these data suggest that editing does not occur in a linear sequence along each transcript. Pre-edited mitochondrial cDNAs have also recently been found in A. carterae mtDNA [20].

rRNA transcripts

Comparisons of rRNA cDNAs to genomic sequences are constrained by the smaller sizes of these sequences (for example, 63 nucleotides for RNA7). In particular, where PCR was used to amplify genomic sequence, a greater portion of the sequence represents primer binding sites and therefore cannot be used in such a comparison. Nevertheless, from the available data, there is no evidence of editing of RNA8, RNA10 or RNA7. For LSUE, complete genomic sequence (170 nucleotides) was available from the internal regions of five PCR fragments, with the majority of the sequence available from a further four PCR products using LSUE primers. These sequences were identical to the cDNAs except for three consecutive nucleotides that were absent in two of the three LSUE cDNAs obtained from the EST survey. To test this anomaly, a further five cDNAs were independently generated, and these all contained the three nucleotides, and therefore were identical to genomic LSUE sequences and to one of the original EST sequences. These results suggest that the three-nucleotide deletions seen in two cDNAs represent a rare artifact, likely generated during reverse transcription, and that K. micrum LSUE is likely also not edited.

There was, however, evidence of substitutional editing for LSUA and LSUG. In both cases genomic copies of these sequences differed from transcripts: in LSUG at eight positions and in LSUA at six positions (Figures 5 and 6). Consistent with the protein-coding genes, these substitutions consist mainly of A to G (36%), C to U (43%) and U to C (14%) substitutions, with one case of C to G. Given that dinoflagellate mitochondrial genes occur in multiple copies, recovery of further, independently isolated copies of these genes will be required to substantiate these inferences of rRNA editing. Evidence for rRNA editing has also recently been reported with the dinoflagellate A. catenella, where two inferred editing events were identified for the 'LSUE-like' rRNA [17].

Figure 6. Predicted secondary structures of dinoflagellate mitochondrial LSU rRNA fragments. RNA sequences were deduced from RNA and DNA sequences, and structures were modelled on the secondary structure of E. coli LSU rRNA. Fragments correspond to K. micrum RNA2, LSUA, LSUE, LSUG and RNA10, and H. triquetra LSUF. Note that the potential hairpin at the 5' end of RNA10 does not have a counterpart in E. coli LSU rRNA. Only a portion of the RNA2 cDNA sequence is shown; also, the actual 5' terminus of LSUA (and LSUF) likely extends past the sequence shown. Positions of the dinoflagellate fragments are mapped onto the full E. coli LSU rRNA structure, inset. Putative Watson-Crick and wobble base pairs are indicated by lines and dots, respectively, G∘A pairs by open circles, and non-canonical pairs by closed circles. Positions enclosed by a circle are editing sites, with the post-edited nucleotide shown. Oligoadenylation is indicated by italics. Helices are numbered according to the E. coli 23S rRNA structure [61].

Discussion

Prior to this study our view of the dinoflagellate mitochondrial genome was gleaned from relatively sparse molecular data obtained from several diverse dinoflagellate taxa. These data nevertheless provided a tantalizing view of a mitochondrial genome displaying several eccentricities. Coding sequences for entire or partial versions of cox1, cob or cox3 have been shown to occur in multiple copies and in different genomic contexts in C. cohnii [15,27], G. polyedra [16], P. piscicida [28], and A. catenella [17]. These data paint a picture of dinoflagellate genomes in sharp contrast to the minimalist 6 kb apicomplexan mtDNA, which encodes single copies of these genes, tightly packed together [13]. Similarly, extensive RNA editing has been described in mRNAs from diverse dinoflagellates [18-20], a process that does not occur in apicomplexans. In this study we have generated a much more comprehensive body of mitochondrial genomic and transcript data for two dinoflagellate species, C. cohnii and K. micrum, and these data are bolstered by a concurrent mitochondrial genomic study of the dinoflagellate A. carterae [20]. Together, these results reinforce the view that the dinoflagellate mitochondrial genome has diverged radically in form from that of apicomplexans, despite the persistence of some intriguing similarities.

Mitochondrial genome content and form

Compared to the complement of 43 to 52 genes in the mitochondrial genome of ciliates [12], the most basal member of the phylum Alveolata, the very low information content of apicomplexan mtDNA (three protein-encoding genes – cox1, cox3 and cob – and ~23 short transcription units that encode the functional SSU and LSU rRNAs) clearly shows that there has been considerable mitochondrial gene loss and/or relocation to the nucleus during alveolate evolution. We infer that much, if not all, of this gene relocation must have occurred prior to the last common ancestor of dinoflagellates and apicomplexans. In EST surveys, we have only identified the same three protein-coding genes (cox1, cox3 and cob); moreover, we found no other mitochondrial ORFs of known function in > 28 kb and > 14 kb of mtDNA sequence from K. micrum and C. cohnii, respectively.
These findings are consistent with the previous demonstration that cox2, an otherwise nearly ubiquitous component of mitochondrial genomes, has been relocated to the nucleus in both apicomplexans and dinoflagellates [25,29]. The only additional genes we identified are ones representing the mitochondrial SSU and LSU rRNAs, which together with cox1 and cob are universally present in mtDNA. No tRNA genes have been found linked to mtDNA sequences, similar to apicomplexans, where tRNAs are apparently imported into mitochondria from the cytoplasm [13].

Dinoflagellates and apicomplexans also share the characteristic of highly fragmented SSU and LSU rRNAs. Fragmentation of mitochondrial rRNA genes has been documented in the mitochondrial genomes of several eukaryotes, including ciliates [30,31], several green algae [8,32-36] and a fungus [37]. The degree of fragmentation in apicomplexan mitochondrial rRNA is more extreme than in these other cases, with 23 fragments for the SSU and LSU rRNAs reported to date, coding regions for which are rearranged and interspersed with other genes in the genome [14]. From within three disparate dinoflagellate taxa we have identified eight rRNA fragments similar to fragments in P. falciparum, and three of these rRNA species have also recently been reported from two further taxa, A. catenella and O. marina [17,21]. The dinoflagellate rRNA fragments mostly appear to correspond to their Plasmodium counterparts in length and sequence termini, suggesting that a stable level and pattern of fragmentation has been inherited from the common ancestor of dinoflagellates and apicomplexans. Given that ciliate mitochondrial rRNAs are comparatively intact (encoding bipartite SSU and LSU rRNAs, and with only the fragmented LSU rRNA gene rearranged; see [12]), the extreme fragmentation in dinoflagellates and apicomplexans must have occurred since their divergence from ciliates.

Despite a similar gene content, the arrangement of dinoflagellate and apicomplexan mitochondrial genomes is radically different. Where the apicomplexan genome is relatively simple and compact, the dinoflagellate mitochondrial genome is complex, with multiple copies of each gene imbedded within different genomic contexts. Gene fragments and non-coding regions are also repeated, altogether suggesting a great deal of recombination in the genome, which is also consistent with the lack of sequence divergence among the multiple copies of these elements. Shotgun sequence data recently published for the A. carterae mitochondrion corroborate this picture of a recombining complex genome, and further suggest that the majority of the mitochondrial genome (~85%) might be non-coding [20].

Gene expression in dinoflagellate mitochondria

Within the K. micrum EST survey, long cDNAs that encoded several mitochondrial genes or gene fragments (the longest being 5854 bp) were noted. By contrast, most mitochondrial cDNAs we recovered encoded a single gene, suggesting the longer transcripts may be rapidly processed into shorter molecules. Polycistronic transcripts up to 5.9 kb are also known from apicomplexan mtDNA; these are rapidly processed to short, single-gene transcripts [38]. Interestingly, the polycistronic transcripts from K. micrum are not edited, indicating that RNA editing acts on the individual gene transcripts.

The use of alternative initiation codons in dinoflagellate mitochondrial genes is consistent with what is seen in the mitochondria of other alveolates. In Plasmodium species, cox1 and cox3 lack an in-frame ATG, and while cob does contain an ATG near the initiation site, it is uncertain whether initiation occurs at this site or upstream of it [24] (as in the case of dinoflagellate cob). ATT and ATA have been proposed as alternative initiator codons in Plasmodium species [39] (as well as some animal, fungal and algal mitochondrial genes [9,40]). Several mitochondrial genes from the ciliate Tetrahymena pyriformis also apparently use alternative initiation codons of the form ATN or NTG: in the case of cob an ATG within eight codons of the predicted N-terminus is apparently ignored, with GTG used in its place [41]. Thus, there are precedents for reliance on codons other than ATG for translation initiation within alveolates. Potential initiator ATN/NTG codons exist in all three Karlodinium mitochondrial genes; however, a broader survey of dinoflagellates or analysis of protein sequences will be necessary to identify the most likely candidates.

An absence of stop codons is more unusual. In T. pyriformis all mitochondrial protein-coding genes terminate with TAA [12]. TGA encodes tryptophan (as in several mitochondrial systems [9,42]) and TAG is simply not used. All three Plasmodium mitochondrial protein-coding genes also use TAA [24]. By contrast, many dinoflagellate mitochondrial gene transcripts appear to lack any termination codon. With only a single known exception (a cox3 fragment from Lingulodinium polyedrum [16]), transcripts are oligoadenylated upstream of any of the standard termination codons, and RNA editing does not generate an in-frame stop. Further, in none of the transcripts is a sense codon uniquely localized in the 3' region in such a way as to suggest that it serves as an alternative terminator (as in [8,43]). The oligoadenylation of K. micrum cox3 mRNA does produce a UAA codon, as is also the case for cox3 transcripts for A. carterae and O. marina [21], which suggests that cox3, unlike cox1 or cob, might utilize a conventional stop.
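
How oligoadenylation after a terminal U can complete a UAA codon may be easier to see in a small example. The following Python sketch uses made-up sequences (not the K. micrum cox3 sequence) and hypothetical function names, and assumes the reading frame starts at the first nucleotide:

```python
from typing import Optional


def oligoadenylate(body: str, tail_len: int = 10) -> str:
    """Append an oligo(A) tail of the given length to a transcript body."""
    return body + "A" * tail_len


def first_in_frame_uaa(mrna: str) -> Optional[int]:
    """Return the 0-based start of the first in-frame UAA codon, or None."""
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] == "UAA":
            return i
    return None


# Hypothetical body whose last nucleotide is a U in the first codon position:
# GGU CUG CCA U + AAAA... -> GGU CUG CCA UAA ...
in_frame_u = "GGUCUGCCAU"
# Hypothetical body ending in a U in the third codon position:
# GGU CUG CCU + AAAA... -> GGU CUG CCU AAA ... (no UAA is formed in frame)
out_of_frame_u = "GGUCUGCCU"

print(first_in_frame_uaa(in_frame_u))                       # no stop before tailing
print(first_in_frame_uaa(oligoadenylate(in_frame_u)))       # stop created by the tail
print(first_in_frame_uaa(oligoadenylate(out_of_frame_u)))   # tail read as AAA codons
```

In this toy case a stop appears only when the terminal U sits at the first position of a codon, so that the first two A residues of the tail complete UAA; otherwise the tail is read as lysine (AAA) codons, as in the oligo(A) translations shown in Figure 4.
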
Such a mechanism for reconstituting a functional UAA is known to occur in some mammalian mitochondrial transcripts [40].

It is unclear how the mitochondrial translation machinery might cope with the absence of termination codons. Release factors that are essential for disassembly of the ribosome usually recognize specific codons, so the absence of these codons could block ribosome disassembly. There are precedents in other mitochondrial systems for the lack of termination codons: transcripts of two plant mitochondrial genes have been shown to be oligoadenylated upstream of in-frame stops [44]. Proteins encoded by both of these genes can be detected, indicating that the corresponding transcripts are successfully translated. In human mitochondria, a rare mutation has been shown to ablate a stop codon, and yet the corresponding protein is still detectable in these cell lines [45]. Eubacteria are known to be able to rescue damaged mRNA molecules that have lost their termination codon by use of a specialized RNA with properties of both a tRNA and an mRNA [46]. These so-called tmRNAs restart protein synthesis by providing a terminal mRNA section that encodes a functional stop codon. It has been speculated that an equivalent system might be used in plant and animal mitochondrial systems where mRNAs lack stop codons [44,45]. Indeed, tmRNA-like RNA species have been identified in the mitochondria of jakobid flagellates such as R. americana; however, these RNAs lack the terminal mRNA-like segment of a conventional tmRNA [47]. Moreover, the C-terminal tag provided by a tmRNA normally targets the modified protein for degradation rather than for function [48]. Whatever the actual mechanism of translation termination in dinoflagellate mitochondria, it appears to present a clear difference with respect to protein synthesis termination in ciliate and apicomplexan mitochondria.

Lastly, we have found a likely case of trans-splicing of dinoflagellate mitochondrial transcripts, which adds a further layer of complexity to genome organization and expression in these organelles. While we cannot conclusively eliminate the possibility of a complete cox3 coding sequence in dinoflagellates, we have not been able to detect an intact gene. This negative result is consistent with all other studies to date, which report only partial cox3 sequences from five different dinoflagellate taxa [16-18,20,28] (note that the A. catenella cox3 is reported as complete [17], but it lacks approximately 300 nucleotides compared with homologs in other dinoflagellates and in apicomplexans). All available data from genomic fragments and transcripts suggest that the complete cox3 transcript is generated by trans-splicing. Such trans-splicing has not been reported for either apicomplexan or ciliate mitochondria. In ciliates nad1 is split into two segments [12], but they are independently transcribed, and there is no evidence of splicing of the corresponding transcripts to create a continuous, complete nad1 ORF [41]. Trans-splicing occurs in plant mitochondria [49,50], but in these cases the coding breakpoints are flanked by group II intron elements, which form secondary structures that mediate the splicing events. We have no evidence of group II introns in dinoflagellate mtDNA, but we do note that the intergenic sequences contain numerous inverted repeats consistent with extensive secondary structure, which might conceivably facilitate splicing events. The unique nature of the dinoflagellate trans-splicing is also evident from the inclusion of five A residues at the splice boundary that appear to derive from the oligo(A) tail of the upstream fragment. The removal of any downstream sequence by oligoadenylation prior to splicing argues against the involvement of a cis-acting element such as a group II intron in the splicing process. It is conceivable that oligoadenylation of the short 5' cox3 transcript could serve as a degradation signal for these short transcripts, as has been observed in human mitochondria [51]. However, lack of a complete cox3 coding sequence, coupled with the fact that the site of oligoadenylation corresponds with the break in coding sequence of 5' and 3' cox3 portions, suggests that the short cox3 transcripts are important intermediates in the generation of the complete cox3 transcripts.

RNA editing

The RNA editing observed in K. micrum cox1, cob and cox3 mRNAs is consistent with the level and type of editing observed in cox1 and cob mRNAs in other dinoflagellate species [18-20], with the exception that cox3 is even more heavily edited than either cox1 or cob. While some editing sites are conserved, others are unique to certain taxa, suggesting that new editing sites are constantly evolving in dinoflagellates. In this study we also found evidence in K. micrum of editing of rRNA fragments LSUG and LSUA. RNA editing of A. catenella LSUE has also recently been reported [17]. At present the data are insufficient to assess the conservation of rRNA editing sites among taxa; however, two inferred editing sites in A. catenella LSUE are not edited in K. micrum, suggesting that rRNA editing sites are constantly evolving, as are those in protein-coding genes.

Whether RNA editing plays some functional role in dinoflagellate mitochondria is unclear. From analysis of protein-coding genes in several dinoflagellates, Lin et al. [19] noted that the majority of editing events are to either a C or G, thus generating a net reduction in A+U content from the bias of ~70% for the coding sequences.
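The compositional effect described above can be illustrated with a minimal sketch in Python. It is a hedged illustration only: the sequence and the edit positions are hypothetical (they are not data from this study or from Lin et al. [19]), and the function names are invented for this example. The sketch shows how substitutions to C or G lower the A+U fraction of a transcript that starts at about 70% A+U.

```python
def au_content(rna: str) -> float:
    """Fraction of A and U residues in a sequence (T is treated as U)."""
    seq = rna.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


def apply_edits(rna: str, edits: dict[int, str]) -> str:
    """Return a copy of `rna` with substitutions applied at 0-based positions."""
    bases = list(rna.upper().replace("T", "U"))
    for position, new_base in edits.items():
        bases[position] = new_base.upper()
    return "".join(bases)


# Hypothetical 10-nt pre-edited sequence with 70% A+U (not real data).
pre_edited = "AUUAGCUCAU"

# Hypothetical edits: A-to-G at position 0 and U-to-C at position 2.
edited = apply_edits(pre_edited, {0: "G", 2: "C"})

print(f"pre-edited A+U:  {au_content(pre_edited):.2f}")
print(f"post-edited A+U: {au_content(edited):.2f}")
```

With real data, the same two functions could be applied to an aligned genomic and cDNA sequence pair, taking the edit positions from the mismatches between them.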
We observe this trend also in Karlodinium protein-coding sequences. This re-tailoring of mRNA sequences might better accommodate the suite of nucleus-encoded tRNAs that are likely imported from the cytoplasm, and which typically participate in the decoding of nucleus-encoded mRNAs having a more balanced A+U content [19]. Ribosomal RNA is also sensitive to A+U content, with secondary structure elements such as hairpin loops better stabilized by G-C than by A-U pairs; thus, helical regions tend to be relatively more G+C rich than other rRNA domains. While the available data for rRNA editing are limited (14 editing sites), it is interesting that the editing types in rRNAs have an overall neutral impact on A+U content. Indeed, the A+T content of mitochondrial genomic sequence specifying rRNAs is already much reduced (56%) compared to that of the protein-coding genes. This observation might add weight to the notion that editing helps correct (at the RNA level) the A+T skew of protein-coding genes.

The mechanism of RNA editing in dinoflagellate mitochondria is also unknown; however, the possibility of a guide RNA (gRNA)-assisted mechanism, similar to that employed in trypanosomatid mitochondria [52], has recently been suggested [20]. Nash et al. [20] report that gene fragments encoded in mitochondria sometimes encode the 'corrected' nucleotide at an inferred editing site (in 6 out of 25 sites for which they had data). Thus such fragments could encode templates that direct the editing events of full-length transcripts. We analyzed the K. micrum data for similar evidence of post-edited nucleotides represented in gene fragments. From five fragments (representing unambiguously truncated genes) that span 71 editing sites across the three protein-encoding genes, only one site in one of the fragments corresponds to a 'corrected' nucleotide seen in cDNAs at an inferred editing site (nucleotide 30 in the cob gene). An independent copy of the cob genomic sequence verified that this nucleotide difference is genuine (not a PCR error). Hence this might represent an example of an editing template in K. micrum; however, if gRNAs are responsible for all editing events, a very large number of additional fragments must exist to direct the remainder of the changes. Clearly further work is required to shed light on the mechanism of RNA editing in dinoflagellates.

Future directions

A key question that remains is whether the observed diversity of dinoflagellate mitochondrial genes, gene fragments, and repetitive elements derives from a single mtDNA molecule or from multiple chromosomes. A similar scenario of mitochondrial genes occurring as multiple copies and fragments is seen in the ichthyosporean A. parasiticum, a unicellular organism closely related to animals [4]. In this protist, several hundred small linear chromosomes constitute the mitochondrial genome, each encoding a smattering of genes and partial genes. Diplonemids, members of the phylum Euglenozoa, also contain fragmented genes on separate circular mitochondrial chromosomes [6]. It is unknown whether either of these unusual situations applies to the organization of the dinoflagellate mitochondrial genome; however, in this regard we make two preliminary observations. One is that long-range PCR was unable to generate longer contiguous sequences linking the many mtDNA elements we report in this study. Rather, additional short unique gene linkages were obtained, and it is clear that we have yet to sample the full diversity of gene combinations. Secondly, the presence of individual genes in partial tandem repeats (see Figure 3B, vi and viii) is consistent with minicircles, as seen in dinoflagellate plastid genomes [53]. If these cases represent true minicircles, we have been unable to amplify a corresponding sequence to close these circles (note that Figure 3B, vi and vii contain unique sequence relative to vi and viii, respectively). It is also possible, of course, that the tandem repeats that we observe are simply a consequence of further recombination events, and the high diversity of gene combinations.

Conclusion

A greater depth of sampling of dinoflagellate mitochondrial DNA and mRNA has provided a clearer view of a complex genome and many peculiarities of gene expression. We find that the dinoflagellate mitochondrial genome shares several features with the mtDNA of its apicomplexan sister lineage, but also has many novel characteristics. Features in common for the two lineages are: (1) a very high level of gene relocation from the mitochondrion, (2) extensive rRNA gene fragmentation and dispersal, and (3) use of non-standard initiation codons. Features unique to dinoflagellates are: (1) gene copy number expansion and reorganization, (2) loss of stop codons from protein-coding genes, (3) mRNA trans-splicing, and (4) RNA editing of protein-coding and rRNA transcripts. These data demonstrate a remarkable burst of organelle genome evolution in dinoflagellates following divergence from Apicomplexa, and also challenge our understanding of the mechanistic details of genome maintenance and expression, most notably translation termination.

Methods

Cell culture, nucleic acid extraction, and mtDNA cloning

C. cohnii cells were cultured and nucleic acids extracted as previously described [54]. K. micrum and H. triquetra were cultured as previously described [22,26] and genomic DNA was extracted using the DNeasy Plant Minikit (Qiagen, Hilden, Germany). For C. cohnii, a fraction was enriched in mtDNA by isolating mitochondria via subcellular fractionation.
This fraction was hydrolyzed with EcoRI and ligated into pBluescript KS+ (Stratagene, Cedar Creek, Texas, USA), following which plasmids were transformed into competent E. coli cells [55]. Hybridization probes 'cob' and 'cox3' (see Southern blot analysis, below) were used to identify positive clones by hybridization of colony lifts [56]. For K. micrum, PCR was used to amplify mtDNA fragments using oligonucleotides (20–22 nucleotides) designed from mitochondrial genes identified from an EST survey [22] using TBestDB [57]. PCR products were cloned into pGEM®-T Easy vector (Promega, Madison, Wisconsin, USA) and fully sequenced. Additional primers were designed from sequence derived from these products. Analysis of DNA sequences was performed with the software package Sequencher™ 4.2.2 (Gene Codes Corporation, Ann Arbor, Michigan, USA). Protein alignments were made with the software packages Clustal X [58] and MacClade (Sinauer Associates, Massachusetts, USA). New sequences have been submitted to GenBank (GenBank accession numbers EF442995–EF443047, and AM773790–AM773803).

Southern blot analysis

Five hybridization probes were generated using PCR and restriction products as template. The 'cob' probe (753 nucleotides), corresponding to positions 1386–2138 in pcb#2 and encompassing most of the cob reading frame, was amplified by PCR using cob51 (5'-CTGTGGTCCAGATATCTTTC-3') and cob296 (5'-CTTCTAATGAATTATCTG-3') primers. 'cb1' (430 nucleotides) was generated by PCR from pcb#7 using primers P51 (5'-CTATCTAAATCCTATAAACAATG-3'; positions 2411–2433) and P25 (5'-AAGGATTTGGTTTCTTGATG-3'; positions 2821–2840), and 'cb3' (716 nucleotides) from pcb#2 using primer P50 (5'-CTGCCAGAGAATTATTGGTTAAC-3') and the M13 reverse vector-based primer. 'cox3' was generated by BamHI hydrolysis of a cox3-containing clone previously prepared. The deduced amino acid sequence of this 300-nt fragment exhibited a high degree of identity with that of cytochrome oxidase subunit 3 (Cox3) in P. falciparum (amino acids 272–289). All of these fragments were purified from gels and used as templates in random hexamer radiolabelling as previously described [54]. A final Southern hybridization probe, 'rnl' (specific for LSUG), consisted of an 18-mer oligonucleotide (5'-GGTTAGAAACTGTCGCTG-3') that was 5' 32P-end-labelled [56]. Unincorporated isotope was removed by spin chromatography using a Sephadex G-25 MicroSpin™ column (Pharmacia, London, UK). Southern hybridization and filter washing conditions were as previously outlined [54] using RNase A-treated DNA samples to eliminate any RNA contamination.

Transcriptional analysis

K. micrum and H. triquetra transcripts were inferred from cDNAs prepared as previously described for EST surveys [22,26]. Complete sequences were generated from cDNAs maintained as frozen E. coli clones. RT-PCR was used to amplify mRNA sequences not represented in the initial EST survey (e.g. full length cox1). The 3' ends of transcripts were inferred from oligoadenylation sites.

For C. cohnii, 3'-end mapping of rRNAs was performed using 3'-RACE. Briefly, isolated mtRNA was incubated with recombinant yeast poly(A) polymerase (USB) and 0.5 mM CTP for 20 min followed by a 10 min incubation with 0.5 mM ATP using the same conditions as previously outlined [59,60]. cDNA synthesis was performed using AMV reverse transcriptase (Promega) with an oligo(dT) primer (5'-AATAAAGCGGCCGCGGATCCAATTTTTTTTTTTTTTTTVN-3') [61] following manufacturer's protocols. The cDNA was used in PCR amplification with primers P4 (5'-AATAAAGCGGCCGCGGATCCAA-3') and either LSUG4 (5'-AGAAGATTCCATTGGAAG-3') for LSUG, or LSUE4 (5'-AAGGTAGNNNAATTCCTTGATAGG-3') for LSUE. PCR amplification products were cloned into pT7Blue T-vector (Novagen) and sequenced. LSUE 5'-end sequence was generated by cDNA sequencing using primer LSUE2 (5'-TTCATGCAGGACGGARMTTACCC-3'). Ribosomal RNA sequences were manually fitted to the Escherichia coli secondary structure models [62] and the structure diagrams were drawn using the program XRNA (B Weiser and H Noller, personal communication).

Abbreviations

nt, nucleotides; bp, basepairs; cDNA, complementary DNA; rRNA, ribosomal RNA; kb, kilobase; mtDNA, mitochondrial DNA; PCR, polymerase chain reaction; RT-PCR, reverse transcriptase polymerase chain reaction; RACE, rapid amplification of cDNA ends; LSU, large subunit; SSU, small subunit; ORF, open reading frame; EST, expressed sequence tag; tmRNA, transfer-messenger RNA; UTR, untranslated region.

Competing interests

The authors declare that there are no competing interests.

Authors' contributions

CJJ generated K. micrum and H. triquetra data, and drafted the manuscript. JEN generated C. cohnii data. MNS modeled rRNA secondary structures. PJK provided access to K. micrum and H. triquetra EST data and cDNA libraries and contributed to study conception. MWG contributed to study conception. RFW contributed to study conception and drafted the manuscript. All authors contributed to data analysis and manuscript revision, and approved the final manuscript.

Acknowledgements

We would like to thank Nicola Patron for critically reading the manuscript and Claudio Slamovits for useful discussions. This project was supported by the Australian Research Council (grant No. DP0663590) and the Canadian Institutes of Health Research (CIHR MOP-4124). Salary and interaction support was received by PJK and MWG from the Canadian Institute for Advanced Research (CIAR), the Michael Smith Foundation for Health Research (PJK), and Canada Research Chairs (CRC) Program (MWG).

References

1. Gray MW, Lang BF, Burger G: Mitochondria of protists. Annu Rev Genet 2004, 38:477-524.
2. Gray MW, Burger G, Lang BF: Mitochondrial evolution. Science 1999, 283:1476-1481.
3. Lang BF, Burger G, O'Kelly CJ, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Gray MW: An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 1997, 387:493-497.
4. Burger G, Gray MW, Lang BF: Mitochondrial genomes: anything goes. Trends Genet 2003, 19:709-716.
5. Shapiro TA, Englund PT: The structure and replication of kinetoplast DNA. Annu Rev Microbiol 1995, 49:117-143.
6. Marande W, Lukes J, Burger G: Unique mitochondrial genome structure in diplonemids, the sister group of kinetoplastids. Eukaryot Cell 2005, 4:1137-1146.
7. Burger G, Forget L, Zhu Y, Gray MW, Lang BF: Unique mitochondrial genome architecture in unicellular relatives of animals. Proc Natl Acad Sci USA 2003, 100:892-897.
8. Nedelcu AM, Lee RW, Lemieux C, Gray MW, Burger G: The complete mitochondrial DNA sequence of Scenedesmus obliquus reflects an intermediate stage in the evolution of the green algal mitochondrial genome. Genome Res 2000, 10:819-831.
9. Swire J, Judson OP, Burt A: Mitochondrial genetic codes evolve to match amino acid requirements of proteins. J Mol Evol 2005, 60:128-139.
10. Fast NM, Xue L, Bingham S, Keeling PJ: Re-examining alveolate evolution using multiple protein molecular phylogenies. J Eukaryot Microbiol 2002, 49:30-37.
11. Van de Peer Y, De Wachter R: Evolutionary relationships among the eukaryotic crown taxa taking into account site-to-site rate variation in 18S rRNA. J Mol Evol 1997, 45:619-630.
12. Burger G, Zhu Y, Littlejohn TG, Greenwood SJ, Schnare MN, Lang BF, Gray MW: Complete sequence of the mitochondrial genome of Tetrahymena pyriformis and comparison with Paramecium aurelia mitochondrial DNA. J Mol Biol 2000, 297:365-380.
13. Feagin JE: The extrachromosomal DNAs of apicomplexan parasites. Annu Rev Microbiol 1994, 48:81-104.
14. Feagin JE, Mericle BL, Werner E, Morris M: Identification of additional rRNA fragments encoded by the Plasmodium falciparum 6 kb element. Nucleic Acids Res 1997, 25:438-446.
15. Norman JE, Gray MW: A complex organization of the gene encoding cytochrome oxidase subunit 1 in the mitochondrial genome of the dinoflagellate, Crypthecodinium cohnii: homologous recombination generates two different cox1 open reading frames. J Mol Evol 2001, 53:351-363.
16. Chaput H, Wang Y, Morse D: Polyadenylated transcripts containing random gene fragments are expressed in dinoflagellate mitochondria. Protist 2002, 153:111-122.
17. Kamikawa R, Inagaki Y, Sako Y: Fragmentation of mitochondrial large subunit rRNA in the dinoflagellate Alexandrium catenella and the evolution of rRNA structure in alveolate mitochondria. Protist 2007, 158:239-245.
18. Zhang H, Lin S: Mitochondrial cytochrome b mRNA editing in dinoflagellates: possible ecological and evolutionary associations? J Eukaryot Microbiol 2005, 52:538-545.
19. Lin S, Zhang H, Spencer DF, Norman JE, Gray MW: Widespread and extensive editing of mitochondrial mRNAs in dinoflagellates. J Mol Biol 2002, 320:727-739.
20. Nash EA, Barbrook AC, Edwards-Stuart RK, Bernhardt K, Howe CJ, Nisbet RE: Organisation of the mitochondrial genome in the dinoflagellate Amphidinium carterae. Mol Biol Evol 2007, 24:1528-1536.
21. Slamovits CH, Saldarriaga JF, Larocque A, Keeling PJ: The highly reduced and fragmented mitochondrial genome of the early-branching dinoflagellate Oxyrrhis marina shares characteristics with both apicomplexan and dinoflagellate mitochondrial genomes. J Mol Biol 2007, 372:356-368.
22. Patron NJ, Waller RF, Keeling PJ: A tertiary plastid uses genes from two endosymbionts. J Mol Biol 2006, 357:1373-1382.
23. Gillespie DE, Salazar NA, Rehkopf DH, Feagin JE: The fragmented mitochondrial ribosomal RNAs of Plasmodium falciparum have short A tails. Nucleic Acids Res 1999, 27:2416-2422.
24. Rehkopf DH, Gillespie DE, Harrell MI, Feagin JE: Transcriptional mapping and RNA processing of the Plasmodium falciparum mitochondrial mRNAs. Mol Biochem Parasitol 2000, 105:91-103.
25. Waller RF, Keeling PJ: Alveolate and chlorophycean mitochondrial cox2 genes split twice independently. Gene 2006, 383:33-37.
26. Patron NJ, Waller RF, Archibald JM, Keeling PJ: Complex protein targeting to dinoflagellate plastids. J Mol Biol 2005, 348:1015-1024.
27. Norman JE: Mitochondrial genome organization, expression and evolution in the dinoflagellate Crypthecodinium cohnii. In PhD thesis Dalhousie University; 2000.
28. Zhang H, Lin S: Detection and quantification of Pfiesteria piscicida by using the mitochondrial cytochrome b gene. Appl Environ Microbiol 2002, 68:989-994.
29. Hackett JD, Yoon HS, Soares MB, Bonaldo MF, Casavant TL, Scheetz TE, Nosenko T, Bhattacharya D: Migration of the plastid genome to the nucleus in a peridinin dinoflagellate. Curr Biol 2004, 14:213-218.
30. Heinonen TY, Schnare MN, Young PG, Gray MW: Rearranged coding segments, separated by a transfer RNA gene, specify the two parts of a discontinuous large subunit ribosomal RNA in Tetrahymena pyriformis mitochondria. J Biol Chem 1987, 262:2879-2887.
31. Schnare MN, Heinonen TYK, Young PG, Gray MW: A discontinuous small subunit ribosomal RNA in Tetrahymena pyriformis mitochondria. J Biol Chem 1986, 261:5187-5193.
32. Boer PH, Gray MW: Scrambled ribosomal RNA gene pieces in Chlamydomonas reinhardtii mitochondrial DNA. Cell 1988, 55:399-411.
33. Denovan-Wright EM, Lee RW: Comparative structure and genomic organization of the discontinuous mitochondrial ribosomal RNA genes of Chlamydomonas eugametos and Chlamydomonas reinhardtii. J Mol Biol 1994, 241:298-311.
34. Fan J, Lee RW: Mitochondrial genome of the colorless green alga Polytomella parva: two linear DNA molecules with homologous inverted repeat termini. Mol Biol Evol 2002, 19:999-1007.
35. Fan J, Schnare MN, Lee RW: Characterization of fragmented mitochondrial ribosomal RNAs of the colorless green alga Polytomella parva. Nucleic Acids Res 2003, 31:769-778.
36. Turmel M, Lemieux C, Burger G, Lang BF, Otis C, Plante I, Gray MW: The complete mitochondrial DNA sequences of Nephroselmis olivacea and Pedinomonas minor. Two radically different evolutionary patterns within green algae. Plant Cell 1999, 11:1717-1730.
37. Forget L, Ustinova J, Wang Z, Huss VA, Lang BF: Hyaloraphidium curvatum: a linear mitochondrial genome, tRNA editing, and an evolutionary link to lower fungi. Mol Biol Evol 2002, 19:310-319.
38. Ji YE, Mericle BL, Rehkopf DH, Anderson JD, Feagin JE: The Plasmodium falciparum 6 kb element is polycistronically transcribed. Mol Biochem Parasitol 1996, 81:211-223.
39. Feagin JE: The 6-kb element of Plasmodium falciparum encodes mitochondrial cytochrome genes. Mol Biochem Parasitol 1992, 52:145-148.
40. Anderson S, Bankier AT, Barrell BG, de Bruijn MH, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, et al.: Sequence and organization of the human mitochondrial genome. Nature 1981, 290:457-465.
41. Edqvist J, Burger G, Gray MW: Expression of mitochondrial protein-coding genes in Tetrahymena pyriformis. J Mol Biol 2000, 297:381-393.
42. Gray MW, Lang BF, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Brossard N, Delage E, Littlejohn TG, et al.: Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res 1998, 26:865-878.
43. Kück U, Jekosch K, Holzamer P: DNA sequence analysis of the complete mitochondrial genome of the green alga Scenedesmus obliquus: evidence for UAG being a leucine and UCA being a non-sense codon. Gene 2000, 253:13-18.
44. Raczynska KD, Le Ret M, Rurek M, Bonnard G, Augustyniak H, Gualberto JM: Plant mitochondrial genes can be expressed from mRNAs lacking stop codons. FEBS Lett 2006, 580:5641-5646.
45. Chrzanowska-Lightowlers ZM, Temperley RJ, Smith PM, Seneca SH, Lightowlers RN: Functional polypeptides can be synthesized from human mitochondrial transcripts lacking termination codons. Biochem J 2004, 377:725-731.
46. Muto A, Ushida C, Himeno H: A bacterial RNA that functions as both a tRNA and an mRNA. Trends Biochem Sci 1998, 23:25-29.
47. Jacob Y, Seif E, Paquet PO, Lang BF: Loss of the mRNA-like region in mitochondrial tmRNAs of jakobids. RNA 2004, 10:605-614.
48. Keiler KC, Waller PR, Sauer RT: Role of a peptide tagging system in degradation of proteins synthesized from damaged messenger RNA. Science 1996, 271:990-993.
49. Bonen L: Trans-splicing of pre-mRNA in plants, animals, and protists. FASEB J 1993, 7:40-46.
50. Chapdelaine Y, Bonen L: The wheat mitochondrial gene for subunit I of the NADH dehydrogenase complex: a trans-splicing model for this gene-in-pieces. Cell 1991, 65:465-472.
51. Slomovic S, Laufer D, Geiger D, Schuster G: Polyadenylation and degradation of human mitochondrial RNA: the prokaryotic past leaves its mark. Mol Cell Biol 2005, 25:6427-6435.
52. Benne R: RNA editing in trypanosomes. Eur J Biochem 1994, 221:9-23.
53. Zhang Z, Green BR, Cavalier-Smith T: Single gene circles in dinoflagellate chloroplast genomes. Nature 1999, 400:155-159.
54. Norman JE, Gray MW: The cytochrome oxidase subunit 1 gene (cox1) from the dinoflagellate, Crypthecodinium cohnii. FEBS Lett 1997, 413:333-338.
55. Hanahan D: Studies on transformation of Escherichia coli with plasmids. J Mol Biol 1983, 166:557-580.
56. Ausubel F, Brent R, Kingston R, Moore D, Seidman J, Smith J, Struhl K: Current Protocols in Molecular Biology 1st edition. New York: John Wiley and Sons; 1987.
57. TBestDB [http://tbestdb.bcm.umontreal.ca/searches/login.php?bye=true]
58. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25:4876-4882.
59. Lingner J, Keller W: 3'-end labeling of RNA with recombinant yeast poly(A) polymerase. Nucleic Acids Res 1993, 21:2917-2920.
60. Zaug AJ, Lingner J, Cech TR: Method for determining RNA 3' ends and application to human telomerase RNA. Nucleic Acids Res 1996, 24:532-533.
61. Borson ND, Salo WL, Drewes LR: A lock-docking oligo(dT) primer for 5' and 3' RACE PCR. PCR Methods Appl 1992, 2:144-148.
62. Cannone JJ, Subramanian S, Schnare MN, Collett JR, D'Souza LM, Du Y, Feng B, Lin N, Madabusi LV, Muller KM, Pande N, Shang Z, Yu N, Gutell RR: The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics 2002, 3:2.

