UBC Research Data

Data from: Untangling the early diversification of eukaryotes: a phylogenomic study of the evolutionary origins of Centrohelida, Haptophyta, and Cryptista

Burki, Fabien; Kaplan, Maia; Tikhonenkov, Denis V.; Zlatogursky, Vasily; Minh, Bui Quang; Radaykina, Liudmila V.; Smirnov, Alexey; Mylnikov, Alexander P.; Keeling, Patrick J.

Description

Abstract
Assembling the global eukaryotic tree of life has long been a major effort of Biology. In recent years, pushed by the new availability of genome-scale data for microbial eukaryotes, it has become possible to revisit many evolutionary enigmas. However, some of the most ancient nodes, which are essential for inferring a stable tree, have remained highly controversial. Among other reasons, the lack of adequate genomic datasets for key taxa has prevented the robust reconstruction of early diversification events. In this context, the centrohelid heliozoans are particularly relevant for reconstructing the tree of eukaryotes because they represent one of the last substantial groups that was missing large and diverse genomic data. Here, we filled this gap by sequencing high-quality transcriptomes for four centrohelid lineages, each corresponding to a different family. Combining these new data with a broad eukaryotic sampling, we produced a gene-rich taxon-rich phylogenomic dataset that enabled us to refine the structure of the tree. Specifically, we show that (i) centrohelids relate to haptophytes, confirming Haptista; (ii) Haptista relates to SAR; (iii) Cryptista share strong affinity with Archaeplastida; and (iv) Haptista + SAR is sister to Cryptista + Archaeplastida. The implications of this topology are discussed in the broader context of plastid evolution.

Usage notes
Transcriptome assembly of Amastigomonas sp.
RNA-seq assembly of Amastigomonas sp. from GenBank SRA accession #SRR2170627. Read quality was assessed with FastQC before and after quality trimming and SMART adaptor removal, which was performed with FastqMcf. Cleaned reads were assembled into contigs with Trinity r20140717 using default parameters.
Amastigomonas_sp_transcriptome.fasta.zip

Transcriptome assembly of Raineriophrys erinaceoides
RNA-seq assembly of Raineriophrys erinaceoides from GenBank SRA accession #SRR2170634. Read quality was assessed with FastQC before and after quality trimming and SMART adaptor removal, which was performed with FastqMcf. Cleaned reads were assembled into contigs with Trinity r20140717 using default parameters.
Raineriophrys_erinaceoides_transcriptome.fasta.zip

Transcriptome assembly of Choanocystis sp.
RNA-seq assembly of Choanocystis sp. from GenBank SRA accession #SRR2170626. Read quality was assessed with FastQC before and after quality trimming and SMART adaptor removal, which was performed with FastqMcf. Cleaned reads were assembled into contigs with Trinity r20140717 using default parameters.
Choanocystis_sp_transcriptome.fasta.zip

Transcriptome assembly of Acanthocystis sp.
RNA-seq assembly of Acanthocystis sp. from GenBank SRA accession #SRR2170625. Read quality was assessed with FastQC before and after quality trimming and SMART adaptor removal, which was performed with FastqMcf. Cleaned reads were assembled into contigs with Trinity r20140717 using default parameters.
Acanthocystis_sp_transcriptome.fasta.zip

Transcriptome assembly of Raphidiophrys heterophryoidea
RNA-seq assembly of Raphidiophrys heterophryoidea from GenBank SRA accession #SRR2170621. Read quality was assessed with FastQC before and after quality trimming and SMART adaptor removal, which was performed with FastqMcf. Cleaned reads were assembled into contigs with Trinity r20140717 using default parameters.
Raphidiophrys_heterophryoidea_transcriptome.fasta.zip

Trimmed alignment
Trimmed alignment of all 250 genes. Sequences were aligned automatically with MAFFT-LINSI and then trimmed with BMGE.
fasta_trimmed.zip

Untrimmed sequences
FASTA files of all 250 genes containing untrimmed sequences.
fasta_untrimmed.zip

Single-gene phylogenetic trees
RAxML phylogenetic trees of all 250 genes.
trees.zip
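
The FASTA archives in this record can be inspected without unpacking them. The following is a minimal Python sketch, not part of the original deposit, that counts sequences and total sequence length for every file inside a zip archive. The function names read_fasta and summarize_fasta_zip are hypothetical, and the internal layout of the archives is an assumption: the sketch simply treats every non-directory member as a FASTA file.

```python
import io
import zipfile


def read_fasta(handle):
    """Yield (header, sequence) pairs from a text handle in FASTA format."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize_fasta_zip(zip_source):
    """Return {member_name: (n_sequences, total_length)} for each archive member.

    Assumption: every non-directory member is a FASTA file.
    zip_source may be a path or a binary file-like object.
    """
    summary = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            with archive.open(info) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
                lengths = [len(seq) for _, seq in read_fasta(text)]
            summary[info.filename] = (len(lengths), sum(lengths))
    return summary
```

Calling summarize_fasta_zip("fasta_trimmed.zip") on a downloaded archive would return one entry per file. For aligned files the reported lengths include gap characters, so they reflect alignment width rather than the number of residues.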

Licence

CC0 Waiver