Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some non-equilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive and negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We apply our method to the UK10K phased haplotype dataset of 3,781 individuals and find a similar proportion of neutral, moderately deleterious, and deleterious variants compared to previous estimates made using the site frequency spectrum. We discuss several interpretations for this result, including that selective constraints have remained constant over time.

Lind et al. 2010; Jacquier et al. 2013), a unimodal distribution with a similar shape to a gamma distribution (Sanjuán et al. 2004; Domingo-Calap et al. 2009; Peris et al. 2010), and a bimodal distribution with one part of the probability mass on nearly neutral mutations and the other one on the highly deleterious mutations (Hietpas et al. 2011). However, the data still points to a bimodal DFE with mutations being either neutral or very deleterious in the majority of the studies where other unimodal simpler distributions provided the best fit to the data (Sanjuán et al. 2004; Domingo-Calap et al. 2009; Peris et al. 2010; Jacquier et al. 2013).
This highlights that the DFE might have a more complex form than the simpler probability distributions typically used to fit data. In mutation-accumulation experiments, a gamma distribution is typically assumed for the DFE of deleterious mutations, since there is little information to distinguish between alternative distributions (Halligan & Keightley 2009).

The other main approach is to use population genetic variation data to estimate the DFE with information from the site frequency spectrum (SFS) on putatively neutral and deleterious sites (Sawyer & Hartl 1992; Williamson et al. 2005; Keightley & Eyre-Walker 2007; Boyko et al. 2008; Gutenkunst et al. 2009; Kim et al. 2017). An interesting extension has recently been developed to take SFS information and divergence data from an outgroup to infer the DFE from the population where the SFS data was taken along with the rate of adaptive molecular evolution based on the divergence data (Tataru et al. 2017). Two other extensions have been taken to model the correlation between the fitness effects of multiple no...
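
As an illustration of the gamma-distributed DFE mentioned above, the following minimal sketch estimates by Monte Carlo how much of a gamma DFE falls into neutral, moderately deleterious, and deleterious classes. It is not the method of this study. The function name `dfe_bin_proportions`, the shape and mean values, and the class boundaries on |s| are all hypothetical choices made only for the example.

```python
import random


def dfe_bin_proportions(shape, mean_s, thresholds, n_draws=100_000, seed=0):
    """Monte Carlo estimate of the fraction of a gamma DFE in each |s| bin.

    shape, mean_s: gamma shape and mean of |s| (hypothetical values).
    thresholds: bin boundaries on |s| (hypothetical class definitions).
    Returns one proportion per bin, ordered from smallest to largest |s|.
    """
    rng = random.Random(seed)
    scale = mean_s / shape  # the gamma mean equals shape * scale
    edges = sorted(thresholds)
    counts = [0] * (len(edges) + 1)
    for _ in range(n_draws):
        s = rng.gammavariate(shape, scale)
        # Bin index = number of boundaries that s reaches or exceeds.
        idx = sum(s >= edge for edge in edges)
        counts[idx] += 1
    return [c / n_draws for c in counts]


# Illustrative parameters only; they are not estimates from any dataset.
props = dfe_bin_proportions(shape=0.2, mean_s=0.01, thresholds=[1e-5, 1e-3])
for label, p in zip(["neutral", "moderately deleterious", "deleterious"], props):
    print(f"{label}: {p:.3f}")
```

With a shape parameter below 1, the gamma density is concentrated near zero with a long right tail, so a single unimodal distribution can still place appreciable mass in both the nearly neutral and the strongly deleterious classes.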