Detecting genomic regions under selection is an important objective of population genetics. Typical analyses for this goal exploit genetic diversity patterns in present-day data, but rapid advances in DNA sequencing have increased the availability of time series genomic data. A common approach to analyzing such data is to model the temporal evolution of an allele frequency as a Markov chain. Based on this principle, several methods have been proposed to infer selection intensity. One of their differences lies in how they model the transition probabilities of the Markov chain. Using the Wright-Fisher model is a natural choice, but its computational cost is prohibitive for large population sizes, so approximations to this model based on parametric distributions have been proposed. Here, we compared the performance of some of these approximations with respect to their power to detect selection and their estimation of the selection coefficient. We developed a new generic Hidden Markov Model likelihood calculator and applied it to genetic time series simulated under various evolutionary scenarios. The Beta-with-Spikes approximation, which combines discrete fixation probabilities with a continuous Beta distribution, was found to perform consistently better than the others. This distribution provides an almost perfect fit to the Wright-Fisher model in terms of selection inference, for a computational cost that does not increase with population size. We further evaluate this model for population sizes not accessible to the Wright-Fisher model and illustrate its performance on a dataset of two divergently selected chicken populations.
Under the null, λ asymptotically follows a $\chi^2$ distribution with one degree of freedom, a property that allows p-values to be derived, which are useful quantities, in particular in a multiple testing context.
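As an illustration of how a p-value can be obtained from this asymptotic property, the following Python sketch converts a test statistic into a p-value. It assumes that λ is a non-negative likelihood-ratio statistic already expressed on the scale on which the $\chi^2$ approximation holds; the function name `lrt_p_value` is hypothetical and is not part of the software described in this study.

```python
import math


def lrt_p_value(lam):
    """Upper-tail probability of a chi-square(1) variable at lam.

    Assumption: lam is a non-negative likelihood-ratio statistic that is
    asymptotically chi-square distributed with one degree of freedom.
    """
    if lam < 0:
        raise ValueError("the test statistic must be non-negative")
    # With one degree of freedom, chi2_1 is the square of a standard normal Z,
    # so P(chi2_1 > lam) = P(|Z| > sqrt(lam)) = erfc(sqrt(lam / 2)).
    return math.erfc(math.sqrt(lam / 2.0))


if __name__ == "__main__":
    # Statistics for a few hypothetical loci (made-up values for illustration).
    for lam in (0.5, 3.84, 10.0):
        print(lam, lrt_p_value(lam))
```

The closed form through `math.erfc` is specific to one degree of freedom; a general $\chi^2$ survival function from a statistics library would be needed otherwise.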
Having described the general framework used in this study, we now turn to its implementation, which requires specifying the transition kernel $Q_k(x, \cdot)$. One objective of this work was to investigate different models for this transition kernel. A reference model in this context is the one-locus Wright-Fisher (WF) model under diploid selection pressure (Ewens, 2004). However, this model has computational limitations and, in the HMM framework, must generally be approximated. We first present the WF model and then several continuous approximations of this model based on a common approach: the method of moments.
Wright-Fisher transition model
The WF model is a discrete-time model that assumes random mating and non-overlapping generations.
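To make the discrete-time structure concrete, the following Python sketch builds the one-generation transition matrix of a Wright-Fisher population in the simplest possible case. It assumes a haploid population of constant size with no selection and no mutation, so each generation is a binomial resampling of the previous one; the selection term used in this study is deliberately left out. The function name `neutral_wf_matrix` and the parameter `n` are hypothetical.

```python
import math


def neutral_wf_matrix(n):
    """One-generation transition matrix of a neutral Wright-Fisher model.

    Assumptions (illustration only): haploid population of constant size n,
    no selection, no mutation. Entry [i][j] is the probability of moving
    from i copies of the allele to j copies in the next generation.
    """
    if n < 1:
        raise ValueError("population size must be at least 1")
    matrix = []
    for i in range(n + 1):
        p = i / n  # current allele frequency
        # The next generation is n independent draws with success probability p.
        row = [math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(n + 1)]
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    m = neutral_wf_matrix(4)
    for row in m:
        print([round(v, 4) for v in row])
```

The matrix has $(n+1)^2$ entries, and the states 0 and $n$ (loss and fixation) are absorbing in this neutral, mutation-free setting.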
Let $N_e$ be the haploid (constant) effective population size. Under this model, $X(t)$ can take a discrete number of values in $\{0, \frac{1}{N_e}, \ldots, \frac{N_e-1}{N_e}, 1\}$ and the transition kernel of this process is a transition matrix [...] very costly for large $N_e$ and large intersample time. To overcome this issue, one approach is to approximate the Wright-Fisher process with a continuous space process such that integral calculati...