Detecting genomic footprints of selection is an important step in understanding evolution. Accounting for linkage disequilibrium in genome scans increases detection power, but haplotype-based methods require individual genotypes and cannot be applied to pool-sequenced samples. We propose to take advantage of the local score approach to account for linkage disequilibrium in genome scans for selection, accumulating (possibly small) signals from single markers over a genomic segment to clearly pinpoint a selection signal. Using computer simulations, we demonstrate that this approach detects selection with higher power than several state-of-the-art single-marker, window-based or haplotype-based approaches. We illustrate this on two benchmark data sets that include individual genotypes, for which we obtain similar results with the local score and one haplotype-based approach. Finally, we apply the local score approach to Pool-Seq data obtained from a divergent selection experiment on behaviour in quail and obtain precise and biologically coherent selection signals: while competing methods fail to highlight any clear selection signature, our method detects several regions involving genes known to act on social responsiveness or autistic traits. Although we focus here on the detection of positive selection from multiple-population data, the local score approach is general and can be applied to other genome scans for selection or to other genome-wide analyses such as GWAS.
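The idea of accumulating small single-marker signals over a segment can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that each marker contributes a score of -log10(p) minus a tuning threshold, and that scores are accumulated with a running sum reset at zero. The function name `local_score_segment` and the parameter `xi` are hypothetical and chosen for illustration only.

```python
import math


def local_score_segment(pvalues, xi=1.0):
    """Return (score, start, end) of the highest-scoring contiguous segment.

    Assumption (not from the abstract): each marker scores -log10(p) - xi,
    so only markers with p < 10**(-xi) contribute positively.
    P-values must be strictly positive. Indices are 0-based and inclusive;
    (0.0, None, None) is returned when no marker scores positively.
    """
    best, best_start, best_end = 0.0, None, None
    h = 0.0      # running sum, reset to zero whenever it would go negative
    start = 0    # start index of the segment currently being accumulated
    for k, p in enumerate(pvalues):
        if h == 0.0:
            start = k
        h = max(0.0, h + (-math.log10(p) - xi))
        if h > best:
            best, best_start, best_end = h, start, k
    return best, best_start, best_end


if __name__ == "__main__":
    # Toy p-values along a chromosome (made up for illustration).
    pvals = [0.5, 0.01, 0.001, 0.05, 0.9, 0.5]
    print(local_score_segment(pvals, xi=1.0))
```

In this toy input, the marker with p = 0.05 would be unremarkable on its own, yet it extends the segment opened by its two neighbours because its score is still slightly positive.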
Let $X_1, \dots, X_n$ be a sequence of i.i.d. positive or negative integer-valued random variables and $H_n = \max_{0 \le i \le j \le n} (X_i + \dots + X_j)$ be the local score of the sequence. The exact distribution of $H_n$ is obtained using a simple Markov chain. This result is applied to the scoring of DNA and protein sequences in molecular biology.
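One way to see how a Markov chain can give the exact distribution of the local score is the sketch below. It is an illustration under stated assumptions, not necessarily the construction used in the paper: the running sum reset at zero is treated as a chain on the states 0, ..., a, with a made absorbing, so that the mass absorbed after n steps equals the probability that the local score reaches at least a. The function names and the `dist` dictionary format are hypothetical.

```python
from itertools import product
from math import prod


def local_score(xs):
    """Local score of an integer sequence: the largest segment sum, at least 0."""
    h = best = 0
    for x in xs:
        h = max(0, h + x)   # running sum, reset at zero
        best = max(best, h)
    return best


def prob_local_score_at_least(dist, n, a):
    """P(local score of n i.i.d. draws >= a), for an integer a >= 1.

    dist maps each integer score to its probability (assumed to sum to 1).
    """
    state = [0.0] * (a + 1)
    state[0] = 1.0                      # the running sum starts at 0
    for _ in range(n):
        nxt = [0.0] * (a + 1)
        nxt[a] = state[a]               # state a is absorbing
        for i in range(a):
            if state[i] == 0.0:
                continue
            for x, p in dist.items():
                j = min(a, max(0, i + x))   # reflect at 0, absorb at a
                nxt[j] += state[i] * p
        state = nxt
    return state[a]


def brute_force(dist, n, a):
    """Same probability by enumerating all sequences (small n only)."""
    return sum(
        prod(dist[x] for x in seq)
        for seq in product(dist, repeat=n)
        if local_score(seq) >= a
    )


if __name__ == "__main__":
    d = {-2: 0.5, -1: 0.2, 1: 0.2, 3: 0.1}   # toy score distribution
    print(prob_local_score_at_least(d, 5, 3), brute_force(d, 5, 3))
```

The chain needs on the order of n × a × (number of distinct scores) operations, whereas the enumeration grows exponentially in n and serves only as a check on small cases.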
We calculate the density function of $(U^*(t), \theta^*(t))$, where $U^*(t)$ is the maximum over $[0, g(t)]$ of a reflected Brownian motion $U$, where $g(t)$ stands for the last zero of $U$ before $t$, $\theta$ is the hitting time of the level $U^*(t)$, and $g^*(t)$ is the left-hand point of the interval straddling $f^*(t)$. We also calculate explicitly the marginal density functions of $U^*(t)$ and $\theta^*(t)$. Let $U^*_n$ and $\theta^*_n$ be the analogs of $U^*(t)$ and $\theta^*(t)$ respectively, where the underlying process $(U_n)$ is the Lindley process, i.e. the difference between a centered real random walk and its minimum. We prove that $U^*_n / \sqrt{n}$