Data from: Assessing the potential of genotyping-by-sequencing-derived single nucleotide polymorphisms to identify the geographic origins of intercepted gypsy moth (Lymantria dispar) specimens: a proof-of-concept study
Picq, Sandrine; Keena, Melody; Havill, Nathan; Stewart, Don; Pouliot, Esther; Boyle, Brian; Levesque, Roger C.; Hamelin, Richard C.; Cusson, Michel
Forest invasive alien species are a major threat to ecosystem stability and can have enormous economic and social impacts. For this reason, preventing the introduction of Asian gypsy moths (AGM; Lymantria dispar asiatica and L. d. japonica) into North America has been identified as a top priority by North American authorities. The AGM is an important defoliator of a wide variety of hardwood and coniferous trees, displaying a much broader host range and an enhanced dispersal ability relative to the already established European gypsy moth (L. d. dispar). Although molecular assays have been developed to help distinguish gypsy moth subspecies, these tools are not adequate for tracing the geographic origins of AGM samples intercepted on foreign vessels. Yet, this type of information would be very useful in characterizing introduction pathways and would help North American regulatory authorities in preventing introductions. The present proof-of-concept study assessed the potential of single nucleotide polymorphism (SNP) markers, obtained through genotyping-by-sequencing (GBS), to identify the geographic origins of gypsy moth samples. The approach was applied to eight laboratory-reared gypsy moth populations, whose original stocks came from locations distributed over the entire range of L. dispar, comprising representatives of the three recognized subspecies. The various analyses we performed showed strong differentiation among populations (Fst ≥ 0.237), enabling clear distinction of subspecies and geographic variants, while revealing introgression near the geographic boundaries between subspecies. This strong population structure resulted in 100% assignment success of moths to their original population when 2327 SNPs were used. 
Although the SNP panels we developed are not immediately applicable to contemporary, natural populations because of distorted allele frequencies in the laboratory-reared populations we used, our results attest to the potential of genome-wide SNP markers as a tool to identify the geographic origins of intercepted gypsy moth samples.
Usage notes
Picq_et_al_2017_raw_and_filtered_SNP_data
The file contains all of the SNP data used in our article. The files named HapMap.fas.txt, HapMap.hmc.txt and HapMap.hmp.txt were obtained with the de novo UNEAK pipeline and contain, respectively, the sequence of each SNP tag/read, the tag/read counts of each allele of each SNP in each individual, and the genotypes. These files were obtained with the following default parameter settings: minimum tag count c = 5, error tolerance rate in the network filter e = 0.03 and minimum minor allele frequency mnMAF = 0.05. The last file, named LymantriaD_Picq_et_al_80%_coverage_MafHeHw_LD_20170920.vcf, presents the 2327 SNPs obtained after the different SNP filtering steps (see Table 2 in our article). The 133 outlier SNPs are present in this file.
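For readers who want to inspect the filtered VCF file, the following is a minimal Python sketch of how a minor allele frequency check such as the mnMAF = 0.05 threshold could be applied to VCF records. It is not the authors' pipeline. It assumes standard tab-delimited VCF records with biallelic SNPs and a genotype (GT) value as the first FORMAT sub-field; the function names, the sample names and the two toy records are hypothetical and are not taken from the deposited data.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_vcf_records(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (snp_id, genotypes) for each data line of a VCF.

    Assumes the standard layout: 9 fixed columns, then one column per
    sample, with GT as the first colon-separated sub-field.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the header line
        fields = line.split("\t")
        snp_id = fields[2]
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        yield snp_id, genotypes


def minor_allele_frequency(genotypes: List[str]) -> Optional[float]:
    """Return the minor allele frequency of a biallelic SNP.

    Missing alleles (".") are ignored; None is returned when no allele
    was called in any sample.
    """
    alt_count = 0
    called = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue
            called += 1
            if allele != "0":
                alt_count += 1
    if called == 0:
        return None
    freq = alt_count / called
    return min(freq, 1.0 - freq)


# Hypothetical toy records (NOT from the deposited file), three samples.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\tind3",
    "tag1\t10\tsnp1\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1\t1/1",
    "tag2\t20\tsnp2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/0\t./.",
]

MIN_MAF = 0.05  # same value as the mnMAF setting quoted in the usage notes

kept = []
for snp_id, genotypes in iter_vcf_records(toy_vcf):
    maf = minor_allele_frequency(genotypes)
    if maf is not None and maf >= MIN_MAF:
        kept.append(snp_id)

print(kept)
```

To run the same check on the deposited file, the toy list could be replaced by an open file handle, for example `open(path, encoding="utf-8")`, where `path` points to the local copy of the .vcf file. Counting the records yielded by `iter_vcf_records` is also a quick way to confirm that the file holds the 2327 SNPs stated above.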