Technological advances in DNA sequencing over the last decade now permit the production and curation of large genomic data sets in an increasing number of nonmodel species. Additionally, these new data provide the opportunity for combining data sets, resulting in larger studies with a broader taxonomic range. Whilst the development of new sequencing platforms has been beneficial, resulting in a higher throughput of data at a lower per‐base cost, shifts in sequencing technology can also pose challenges for those wishing to combine new sequencing data with data sequenced on older platforms. Here, we outline the types of studies where the use of curated data might be beneficial, and highlight potential biases that might be introduced by combining data from different sequencing platforms. As an example of the challenges associated with combining data across sequencing platforms, we focus on the impact of the shift in Illumina's base calling technology from a four‐channel system to a two‐channel system. We caution that when data are combined from these two systems, erroneous guanine base calls that result from the two‐channel chemistry can make their way through a bioinformatic pipeline, eventually leading to inaccurate and potentially misleading conclusions. We also suggest solutions for dealing with such potential artefacts, which make samples sequenced on different sequencing platforms appear more differentiated from one another than they really are. Finally, we stress the importance of archiving tissue samples and the associated sequences for the continued reproducibility and reusability of sequencing data in the face of ever‐changing sequencing platform technology.
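The guanine artefact described above can be illustrated with a minimal Python sketch. It assumes that the erroneous base calls appear as long runs of G at the 3′ end of a read, which is how the problem commonly presents in two-channel data, where the absence of signal is read as G. The function names `trim_poly_g` and `poly_g_fraction` and the `min_run` threshold of 10 bases are hypothetical choices for illustration, not part of the authors' pipeline.

```python
from typing import Sequence


def trim_poly_g(seq: str, min_run: int = 10) -> str:
    """Remove a trailing run of G bases if it is at least `min_run` long.

    Assumption: a long 3' poly-G tail is treated as an artefact rather
    than real sequence. Shorter runs are left untouched.
    """
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)
    return stripped if run_length >= min_run else seq


def poly_g_fraction(reads: Sequence[str], min_run: int = 10) -> float:
    """Fraction of reads that carry a trailing poly-G run of `min_run` or more."""
    if not reads:
        return 0.0
    affected = sum(1 for read in reads if trim_poly_g(read, min_run) != read)
    return affected / len(reads)
```

Comparing `poly_g_fraction` between batches sequenced on different platforms is one simple way to check whether a batch may be affected before the data sets are combined.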
Effective conservation actions to counteract the current decline of populations and species require detailed knowledge of their genetic structure. We used Single Nucleotide Polymorphisms (SNPs) to infer the population structure of the highly threatened freshwater pearl mussel Margaritifera margaritifera in the Iberian Peninsula. A total of 130 individuals were collected from 26 locations belonging to 16 basins. We obtained 31,692 SNPs through Genotyping by Sequencing (GBS) and used this dataset to infer population structure. Genetic diversity, measured as observed heterozygosity, was low. Pairwise FST comparisons revealed low levels of genetic differentiation among geographically close populations. Up to three major genetic lineages were identified: Atlantic, Cantabrian and Douro. This structure suggests a close co-evolutionary process with brown trout (Salmo trutta), the primary fish host of this mussel in the studied area. Some sub-basins showed genetic structuring, whereas in others no intrapopulation differentiation was found. Our results confirm that genetic conservation units do not match individual basins, and that knowledge of the genetic structure is necessary before designing recovery plans that may involve relocation or restocking. The same reasoning should be applied to other strictly freshwater species that are sessile or have restricted dispersal abilities and are currently imperiled worldwide.
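The two summary statistics mentioned above, observed heterozygosity and pairwise FST, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' method: genotypes are assumed to be coded as 0, 1 or 2 copies of the alternate allele with `None` for missing calls, and FST is computed with a simple Nei-style formula, (HT − HS) / HT, for one biallelic SNP and two equally weighted populations. The abstract does not say which FST estimator was used, and the function names are hypothetical.

```python
from typing import Optional, Sequence


def observed_heterozygosity(
    genotypes: Sequence[Sequence[Optional[int]]],
) -> float:
    """Proportion of called genotypes that are heterozygous (coded as 1).

    `genotypes` is individuals x loci; None marks a missing call.
    """
    called = 0
    heterozygous = 0
    for individual in genotypes:
        for genotype in individual:
            if genotype is None:
                continue  # skip missing data
            called += 1
            if genotype == 1:
                heterozygous += 1
    return heterozygous / called if called else 0.0


def nei_fst(p1: float, p2: float) -> float:
    """Nei-style FST for one biallelic SNP from two allele frequencies.

    Assumption: both populations are weighted equally.
    """
    # Mean expected heterozygosity within populations.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    return (ht - hs) / ht if ht > 0 else 0.0
```

Under this formula, two populations with identical allele frequencies give an FST of 0, and populations fixed for alternative alleles give 1, so the low pairwise values reported for neighbouring populations correspond to very similar allele frequencies.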
In freshwater fish, processes of population divergence and speciation are often linked to the geomorphology of rivers and lakes that create barriers isolating populations. However, current geographical isolation does not necessarily imply total absence of gene flow during the divergence process. Here, we focused on four species of the genus Squalius in Portuguese rivers: S. carolitertii, S. pyrenaicus, S. aradensis and S. torgalensis. Previous studies based on eight nuclear and mitochondrial markers revealed incongruent patterns, with nuclear loci suggesting that S. pyrenaicus was a paraphyletic group, since its northern populations were genetically closer to S. carolitertii than to other southern populations. Here, for the first time, we successfully applied a genomic approach to the study of the relationship between these species, using a Genotyping by Sequencing approach to obtain single nucleotide polymorphisms (SNPs). Our results revealed a species tree with two main lineages: (i) S. carolitertii and S. pyrenaicus; (ii) S. torgalensis and S. aradensis. Moreover, regarding S. carolitertii and S. pyrenaicus, we found evidence for past introgression between these two species in the northern part of S. pyrenaicus distribution. This introgression reconciles previous mitochondrial and nuclear incongruent results and explains the apparent paraphyly of S. pyrenaicus. Although we cannot distinguish a scenario of hybrid speciation from secondary contact, our estimates are consistent across models, suggesting that the northern populations of S. pyrenaicus received approximately 80% from S. carolitertii and 20% from southern S. pyrenaicus. This illustrates that even in freshwater species currently found in isolated river drainages, we are able to detect past gene flow events in present-day genomes, suggesting that speciation is more complex than simply allopatric.