Uncovering the genetic and evolutionary basis of local adaptation is a major focus of evolutionary biology. The recent development of cost-effective methods for obtaining high-quality genome-scale data makes it possible to identify some of the loci responsible for adaptive differences among populations. Two basic approaches for identifying putatively locally adaptive loci have been developed and are broadly used: one that identifies loci with unusually high genetic differentiation among populations (differentiation outlier methods) and one that searches for correlations between local population allele frequencies and local environments (genetic-environment association methods). Here, we review the promises and challenges of these genome scan methods, including correcting for the confounding influence of a species’ demographic history, biases caused by missing aspects of the genome, matching scales of environmental data with population structure, and other statistical considerations. In each case, we make suggestions for best practices for maximizing the accuracy and efficiency of genome scans to detect the underlying genetic basis of local adaptation. With attention to their current limitations, genome scan methods can be an important tool in finding the genetic basis of adaptive evolutionary change.
Understanding how and why populations evolve is of fundamental importance to molecular ecology. Restriction site-associated DNA sequencing (RADseq), a popular reduced representation method, has ushered in a new era of genome-scale research for assessing population structure, hybridization, demographic history, phylogeography and migration. RADseq has also been widely used to conduct genome scans to detect loci involved in adaptive divergence among natural populations. Here, we examine the capacity of those RADseq-based genome scan studies to detect loci involved in local adaptation. To understand what proportion of the genome is missed by RADseq studies, we developed a simple model using different numbers of RAD-tags, genome sizes and extents of linkage disequilibrium (length of haplotype blocks). Under the best-case modelling scenario, we found that RADseq using six- or eight-base pair cutting restriction enzymes would fail to sample many regions of the genome, especially for species with short linkage disequilibrium. We then surveyed recent studies that have used RADseq for genome scans and found that the median density of markers across these studies was 4.08 RAD-tag markers per megabase (one marker per 245 kb). For many species, the length of linkage disequilibrium is one to three orders of magnitude shorter than the spacing between markers in the typical recent RADseq study. Thus, we conclude that genome scans based on RADseq data alone, while useful for studies of neutral genetic variation and genetic population structure, will likely miss many loci under selection in studies of local adaptation.
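The relationship between marker density, marker spacing and the share of the genome that falls within a tagged haplotype block can be illustrated with a short Python sketch. This is not the authors' model (their R scripts are in Appendix S1); it is a minimal approximation that assumes each RAD-tag tags exactly one haplotype block of fixed length. The function names, the 1 Gb genome and the 10 kb block length are hypothetical values chosen for illustration only.

```python
import math


def spacing_kb(markers_per_mb: float) -> float:
    """Mean distance between markers in kb, given a density in markers per Mb."""
    return 1000.0 / markers_per_mb


def fraction_tagged_even(n_tags: int, genome_bp: int, block_bp: int) -> float:
    """Assumed best case: tags are spaced so that no two tagged blocks overlap.

    Each tag is assumed to tag one haplotype block of length block_bp.
    """
    return min(1.0, n_tags * block_bp / genome_bp)


def fraction_tagged_random(n_tags: int, genome_bp: int, block_bp: int) -> float:
    """Assumed random placement: tags fall uniformly, so tagged blocks can overlap.

    Uses the Poisson approximation 1 - exp(-n * L / G) for the covered fraction.
    """
    return 1.0 - math.exp(-n_tags * block_bp / genome_bp)


if __name__ == "__main__":
    density = 4.08                 # markers per Mb, the median reported in the survey
    genome_bp = 1_000_000_000      # hypothetical 1 Gb genome
    block_bp = 10_000              # hypothetical 10 kb haplotype blocks
    n_tags = round(density * genome_bp / 1_000_000)

    print(f"mean spacing: {spacing_kb(density):.0f} kb")
    print(f"tagged (even spacing): {fraction_tagged_even(n_tags, genome_bp, block_bp):.4f}")
    print(f"tagged (random placement): {fraction_tagged_random(n_tags, genome_bp, block_bp):.4f}")
```

Under these assumed values, only a few percent of the genome lies within a tagged block, which is the qualitative point of the abstract: when haplotype blocks are much shorter than the spacing between markers, most of the genome goes unsampled. The conversion in `spacing_kb` is also the step where inverting the terms (megabases per marker versus markers per megabase) changes the reported figure.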
Correspondence: David B. Lowry, Fax: 517-353-1926; dlowry@msu.edu.
Correction note: The original advance online paper contained two errors associated with the calculation of the median density of RAD-seq tags in the survey of recent RAD-seq genome scan studies (Table S1). The first error was in the size of the assembled stickleback genome, which was reported as 0.53 Gbp, but should have been 0.46 Gbp. The second error was an accidental inversion of terms. These two mistakes contributed to an erroneous statement in the original abstract that the median density of RAD-tags across recent studies "was one marker per 3.96 megabases." The statement has been revised to read: "was 4.08 RAD-tag markers per megabase." Following these corrections, changes were made in the abstract and elsewhere in the paper to reflect a modified interpretation of the results, though we note the main arguments in the article are unaffected. Other minor modifications to the paper were made based upon suggestions by the editors of Molecular Ecology Resources. This version of the article, published as an "accepted article" on 12 November 2016 under DOI 10.1111/1755-0998.12635, replaces the original version of the article published on 12 September 2016 under DOI 10.1111/1755-0998.12596.
The idea for the manuscript was conceived collectively by all authors during an NSF National Institute for Mathematical and Biological Synthesis (NIMBioS) working group. All authors contributed to the writing of the manuscript.
Supporting Information: Additional Supporting Information may be found in the online version of this article: Appendix S1 Supplementary R scripts for Breaking RAD. Table S1 Recent (January 2015 to April 2016) genome scan studies, which used RAD-seq for genotyping.
Andrews et al. (2016). Generally, RADseq methods produce DNA libraries for high-throughput sequencing using restriction enzymes that cut at specific motifs throughout the genome. RADseq markers come in the form of RAD-tags, which are short-read sequences adjacent to restriction enzyme cut sites. Because many polymorphic markers are produced by RADseq, it has frequently been used successfully for population genetic analyses, including assessment of population structure, hybridization, demographic history, phylogeography and migration (Catchen et al. 2013; Cavender-Bares et al. 2015; Combosch & Vollmer 2015; Qi et al. 2015). Markers generated by RADseq have also been quite useful for constructing linkage maps and identifying quantitative trait loci (QTL; Pfender et al. 2011; Houston et al. 2012; Weber et al. 2013; Laporte et al. 2015...