We identify a new natural coalescent structure, which we call the seed-bank coalescent, that describes the gene genealogy of populations under the influence of a strong seed-bank effect, where "dormant forms" of individuals (such as seeds or spores) may jump a significant number of generations before joining the "active" population. Mathematically, our seed-bank coalescent appears as a scaling limit in a Wright-Fisher model with geometric seed-bank age structure if the average time of seed dormancy scales with the order of the total population size N. This extends earlier results of Kaj, Krone and Lascoux [J. Appl. Probab. 38 (2001) 285-300], who show that the genealogy of a Wright-Fisher model in the presence of a "weak" seed-bank effect is given by a suitably time-changed Kingman coalescent. The qualitatively new feature of the seed-bank coalescent is that ancestral lineages are independently blocked at a certain rate from taking part in coalescence events, thus strongly altering the predictions of classical coalescent models. In particular, the seed-bank coalescent "does not come down from infinity," and the time to the most recent common ancestor of a sample of size n grows like log log n. This is in line with the empirical observation that seed-banks drastically increase genetic variability in a population and indicates how they may serve as a buffer against other evolutionary forces such as genetic drift and selection.
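The blocking mechanism described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' method: the abstract does not state the transition rates, so the code assumes a common parametrization in which each pair of active lineages coalesces at rate 1, each active lineage becomes dormant at rate `c`, and each dormant lineage reactivates at rate `c * K`. The names `seedbank_tmrca`, `c`, and `K` are hypothetical and introduced only for illustration.

```python
import random


def seedbank_tmrca(n_active, n_dormant=0, c=1.0, K=1.0, rng=None):
    """Simulate one time to the most recent common ancestor (TMRCA).

    Assumed rates (not taken from the text above):
      - each pair of active lineages coalesces at rate 1,
      - each active lineage falls dormant at rate c,
      - each dormant lineage wakes up at rate c * K.
    Dormant lineages never coalesce, which is the "blocking" effect.
    """
    if c <= 0 or K <= 0:
        raise ValueError("c and K must be positive")
    if n_active < 0 or n_dormant < 0 or n_active + n_dormant < 1:
        raise ValueError("need at least one lineage")
    rng = rng or random.Random()
    n, m = n_active, n_dormant  # active and dormant lineage counts
    t = 0.0
    while n + m > 1:
        r_coal = n * (n - 1) / 2.0  # pairwise coalescence among active lineages
        r_sleep = c * n             # active -> dormant
        r_wake = c * K * m          # dormant -> active
        total = r_coal + r_sleep + r_wake
        t += rng.expovariate(total)  # waiting time until the next event
        u = rng.random() * total     # choose which event occurred
        if u < r_coal:
            n -= 1
        elif u < r_coal + r_sleep or m == 0:
            n -= 1
            m += 1
        else:
            n += 1
            m -= 1
    return t


if __name__ == "__main__":
    rng = random.Random(2024)
    runs = 2000
    for size in (2, 10, 100):
        mean = sum(seedbank_tmrca(size, rng=rng) for _ in range(runs)) / runs
        print(f"n = {size:4d}  estimated E[TMRCA] = {mean:.2f}")
```

Averaging many runs for increasing sample sizes gives a rough feel for how slowly the TMRCA grows with n under these assumed rates; the estimates are Monte Carlo approximations and depend on the chosen `c` and `K`.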
We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially, at a reduced rate, also on the dormant lineages. The presence of "dormant" lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this we provide a Wright-Fisher model with a seedbank component and mutation, motivated by recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to most recent common ancestor, number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.

KEYWORDS Wright-Fisher model; seedbank coalescent; dormancy; site-frequency spectrum; distance statistics

Many microorganisms can enter reversible dormant states of low (or even zero) metabolic activity, for example when faced with unfavorable environmental conditions; see, e.g., Lennon and Jones (2011) for a recent overview of this phenomenon.
Such dormant forms may stay inactive for extended periods of time and thus create a seedbank that should significantly affect the interplay of evolutionary forces driving the genetic variability of the microbial population. In fact, in many ecosystems, the percentage of dormant cells compared to the total population size is substantial and sometimes even dominant (for example, approximately 20% in human gut, 40% in marine water, and 80% in soil; cf. Lennon and Jones 2011, box 1, table a). This abundance of dormant forms, which can be short-lived as well as stay inactive for significant periods of time (decades- or century-old spores are not uncommon), thus creates a seedbank that buffers against environmental change, but potentially also against classical evolutionary forces such as genetic drift, mutation, and selection.

In this article, we investigate the effect of large seedbanks (that is, comparable to the size of the active population) on the patterns of genetic variability in populations over macroscopic timescales. In particular, we extend a recently introduced mathematical ...
Across the tree of life, populations have evolved the capacity to contend with suboptimal conditions by engaging in dormancy, whereby individuals enter a reversible state of reduced metabolic activity. The resulting seed banks are complex, storing information and imparting memory that gives rise to multi-scale structures and networks spanning collections of cells to entire ecosystems. We outline the fundamental attributes and emergent phenomena associated with dormancy and seed banks, with the vision for a unifying and mathematically based framework that can address problems in the life sciences, ranging from global change to cancer biology.
We study the effect of biological confounders on the model selection problem between Kingman coalescents with population growth, and Ξ-coalescents involving simultaneous multiple mergers. We use a low dimensional, computationally tractable summary statistic, dubbed the singleton-tail statistic, to carry out approximate likelihood ratio tests between these model classes. The singleton-tail statistic has been shown to distinguish between them with high power in the simple setting of neutrally evolving, panmictic populations without recombination. We extend this work by showing that cryptic recombination and selection do not diminish the power of the test, but that misspecifying population structure does. Furthermore, we demonstrate that the singleton-tail statistic can also solve the more challenging model selection problem between multiple mergers due to selective sweeps, and multiple mergers due to high fecundity with moderate power of up to 30%.
We derive statistical tools to analyze the patterns of genetic variability produced by models related to seed banks; in particular the Kingman coalescent, its time-changed counterpart describing so-called weak seed banks, the strong seed bank coalescent, and the two-island structured coalescent. As (strong) seed banks stratify a population, we expect them to produce a signal comparable to population structure. We present tractable formulas for Wright's $F_{ST}$ and the expected site frequency spectrum for these models, and show that they can distinguish between some models for certain ranges of parameters. We then use pseudo-marginal MCMC to show that the full likelihood can reliably distinguish between all models in the presence of parameter uncertainty. It is also possible to infer parameters, and in particular determine whether mutation is taking place in the (strong) seed bank.

Population models

Kingman's coalescent (K): The standard model of genetic ancestry in the absence of a seed bank is the coalescent (or Kingman's coalescent) [13], which describes ancestries of samples of size $n \in \mathbb{N}$ from a large, selectively neutral, panmictic population of size $N \gg n$ following e.g. a Wright-Fisher model. Measuring time in units of $N$ and tracing the ancestry of a sample of size $n \ll N$ backwards in time results in a coalescent process $\Pi^n$ in which each pair of lineages merges to a common ancestor independently at rate 1 as $N \to \infty$. A rooted ancestral tree is formed once the most recent common ancestor of the whole sample is reached. We denote this scenario by K. This model is currently the standard null model in population genetics (see e.g. [14] for an introduction) and arises from a large class of population models.

'Weak' seed banks and the delayed coalescent (W): The coalescent was extended in [3] to incorporate a 'weak' seed bank. In this model, an individual inherits its genetic material from a parent that was alive a random number of generations ago.
The random separation is assumed to have mean $\beta^{-1}$ for some $\beta \in (0, 1]$. Measuring time in units of $N$ and tracing the ancestry of a sample of size $n \ll N$ as above, it can be shown that the genealogy is still given by a coalescent in which each pair of lineages merges to a common ancestor independently with rate $\beta^2$. Thus, the effect of the seed bank is to stretch the branches of the Kingman coalescent by a constant factor [3,15], but the topology and relative branch lengths remain identical to those of the coalescent. Hence the weak seed bank coalescent with mean separation $\beta^{-1}$ and population-rescaled mutation rate $u > 0$ is statistically identical to Kingman's coalescent with population-rescaled mutation rate $u/\beta^2$, and e.g. the normalized site frequency spectrum under the infinitely many sites model is invariant between these models [5]. We call the corresponding coalescent a 'delayed coalescent' and denote this scenario by W. Nevertheless, the seed bank does have important consequences e.g. for the estimation of effective population size and mutation rates in the presence of pr...
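The constant stretching of branches under scenario W can be checked with a short calculation. The sketch below relies on the standard fact that, with k lineages each pair of which merges at rate r, the waiting time to the next merger is exponential with rate r k(k-1)/2; taking r = 1 gives scenario K and r = β² gives scenario W. The function names `expected_tmrca` and `expected_tree_length` are hypothetical, introduced only for this illustration.

```python
def expected_tmrca(n, beta=1.0):
    """Expected time to the MRCA of n lineages when each pair merges at rate beta**2.

    beta = 1 corresponds to Kingman's coalescent (scenario K);
    0 < beta < 1 corresponds to the delayed coalescent (scenario W).
    """
    if n < 2:
        raise ValueError("need at least two lineages")
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    pair_rate = beta ** 2
    # With k lineages the total merger rate is pair_rate * k * (k - 1) / 2.
    return sum(2.0 / (pair_rate * k * (k - 1)) for k in range(2, n + 1))


def expected_tree_length(n, beta=1.0):
    """Expected total branch length: k lineages persist for the k-th waiting time."""
    if n < 2:
        raise ValueError("need at least two lineages")
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    pair_rate = beta ** 2
    return sum(k * 2.0 / (pair_rate * k * (k - 1)) for k in range(2, n + 1))


if __name__ == "__main__":
    n = 10
    for beta in (1.0, 0.5):
        t = expected_tmrca(n, beta)
        length = expected_tree_length(n, beta)
        print(f"beta = {beta}: E[TMRCA] = {t:.4f}, E[length] = {length:.4f}, "
              f"ratio = {t / length:.4f}")
```

Both expectations scale by the same factor 1/β², so their ratio does not depend on β, which is consistent with the statement that relative branch lengths are unchanged and that rescaling the mutation rate to u/β² makes the two scenarios indistinguishable.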