UBC Research Data

Data from: Targeted sequence capture and resequencing implies a predominant role of regulatory regions in the divergence of a sympatric lake whitefish species pair (Coregonus clupeaformis)

Hébert, François Olivier; Renaut, Sébastien; Bernatchez, Louis

Description

Abstract

Latest technological developments in evolutionary biology bring new challenges in documenting the intricate genetic architecture of species in the process of divergence. Sympatric populations of lake whitefish represent one of the key systems to investigate this issue. Despite the value of random genotype-by-sequencing methods and decreasing cost of sequencing technologies, it remains challenging to investigate variation in coding regions, especially in the case of recently duplicated genomes as in salmonids, as this greatly complicates whole genome resequencing. We thus designed a sequence capture array targeting 2773 annotated genes to document the nature and the extent of genomic divergence between sympatric dwarf and normal whitefish. Among the 2728 genes successfully captured, a total of 2182 coding and 10 415 noncoding putative single-nucleotide polymorphisms (SNPs) were identified after applying a first set of basic filters. A genome scan with a quality-refined selection of 2203 SNPs identified 267 outlier SNPs in 210 candidate genes located in genomic regions potentially involved in whitefish divergence and reproductive isolation. We found highly heterogeneous FST estimates among SNP loci. There was an overall low level of coding polymorphism, with a predominance of noncoding mutations among outliers. The heterogeneous patterns of divergence among loci confirm the porous nature of genomes during speciation with gene flow. Considering that few protein-coding mutations were identified as highly divergent, our results, along with previous transcriptomic studies, imply that changes in regulatory regions most likely had a greater role in the process of whitefish population divergence than protein-coding mutations. This study is the first to demonstrate the efficiency of large-scale targeted resequencing for a nonmodel species with such a large and unsequenced genome.

Usage notes

Python scripts - File formatting - Sequence Capture
All relevant scripts used to format files, analyse data and obtain results from the sequence capture experiment on two sympatric populations of whitefish ecotypes in the process of speciation.
File: MEC-13-0265_seq-capt.tar.gz

Whitefish sequence capture assembly
FASTA file containing 2,228 genes de novo assembled after conducting a targeted enrichment and re-sequencing experiment on two sympatric populations of whitefish (Coregonus clupeaformis) currently in the process of speciation.
File: assembled_genes.fasta
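
As a starting point for working with assembled_genes.fasta, the following is a minimal Python sketch of how a FASTA file can be read into memory. It is not one of the scripts shipped in MEC-13-0265_seq-capt.tar.gz. The function name read_fasta and the example records are hypothetical, and the sketch assumes standard FASTA layout (a ">" header line followed by one or more sequence lines); the actual header format of the deposited file is not described here.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dict.

    The record id is taken as the first whitespace-delimited token
    after ">" (an assumption about the header layout).
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Store the last record, if any.
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical in-memory records, used only to illustrate the call.
example = """>gene_A example description
ACGT
ACG
>gene_B
TTGA
""".splitlines()

genes = read_fasta(example)
print(len(genes), {name: len(seq) for name, seq in genes.items()})

# For the deposited file, a file handle can be passed instead:
#     with open("assembled_genes.fasta") as handle:
#         genes = read_fasta(handle)
```

Counting the keys of the returned dictionary gives the number of assembled genes in the file, which can be compared against the figure given in the file description.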

Licence

This dataset is made available under a Creative Commons CC0 license with the following additional/modified terms and conditions: CC0 Waiver