Genetic variation is fundamental to population fitness and adaptation to environmental change. Human activities are driving declines in many wild populations and could have similar effects on genetic variation. Despite the importance of estimating such declines, no global estimate of the magnitude of ongoing genetic variation loss has been made across species. By combining studies that quantified recent changes in genetic variation across a mean of 27 generations for 91 species, we conservatively estimate a 5.4%–6.5% decline in within-population genetic diversity of wild organisms since the industrial revolution. This loss has been most severe for island species, which show a 27.6% average decline. We identified taxonomic and geographical gaps in temporal studies that must be urgently addressed. Our results are consistent with single time-point meta-analyses, which indicated that genetic variation is likely declining. However, our results represent the first confirmation of a global decline and provide an estimate of the magnitude of the genetic variation lost from wild populations.
High-throughput sequencing is a powerful tool, but it suffers from biases and errors that must be accounted for to prevent false biological conclusions. Such errors include batch effects: technical errors present only in subsets of the data because of procedural changes within a study. If batch effects are overlooked and multiple batches of data are combined, spurious biological signals can arise, particularly if batches are correlated with biological variables. Batch effects can be minimized through randomization of sample groups across batches. However, in long-term or multiyear studies where data are added incrementally, full randomization is impossible, and batch effects may be a common feature. Here, we present a case study where false signals of selection were detected due to a batch effect in a multiyear study of Alpine ibex (Capra ibex). The batch effect arose because sequencing read length changed over the course of the project and populations were added incrementally to the study, resulting in nonrandom distributions of populations across read lengths. The differences in read length caused small misalignments in a subset of the data, leading to false variant alleles and thus false SNPs. Pronounced allele frequency differences between populations arose at these SNPs because of the correlation between read length and population. This created highly statistically significant, but biologically spurious, signals of selection and false associations between allele frequencies and the environment. We highlight the risk of batch effects and discuss strategies to reduce their impact in multiyear high-throughput sequencing studies.
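The randomization of sample groups across batches mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure. The function names `assign_batches` and `batch_by_group_table` are hypothetical, and the sketch assumes all samples are available before sequencing starts, which is exactly the condition that fails in incrementally extended multiyear studies.

```python
import random
from collections import Counter, defaultdict


def assign_batches(sample_groups, n_batches, seed=0):
    """Spread each group (e.g. population) as evenly as possible over batches.

    sample_groups: dict mapping sample ID -> group label.
    Returns a dict mapping sample ID -> batch index in range(n_batches).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Collect the samples belonging to each group.
    by_group = defaultdict(list)
    for sample, group in sample_groups.items():
        by_group[group].append(sample)

    assignment = {}
    offset = 0  # carried across groups so overall batch sizes stay balanced
    for group in sorted(by_group):
        members = sorted(by_group[group])
        rng.shuffle(members)  # random order within the group
        for i, sample in enumerate(members):
            assignment[sample] = (offset + i) % n_batches
        offset += len(members)
    return assignment


def batch_by_group_table(sample_groups, assignment):
    """Count samples per (group, batch) pair to inspect confounding."""
    return Counter((sample_groups[s], b) for s, b in assignment.items())
```

Inspecting the table returned by `batch_by_group_table` shows whether any group is concentrated in a single batch. A group confined to one batch (for example, one read length) is the kind of confounding that the case study describes.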