Runxi Shen scite author profile

2022

Bulk segregant analysis is a technique for identifying the genetic loci that underlie phenotypic trait differences. The basic approach is to compare two pools of individuals from the opposing tails of the phenotypic distribution, sampled from an interbred population. Each pool is sequenced and scanned for alleles that show divergent frequencies between the pools, indicating potential association with the observed trait differences. Bulk segregant analysis has already been successfully applied to the mapping of various quantitative trait loci in organisms ranging from yeast to maize. However, these studies have typically suffered from rather low mapping resolution, and we still lack a detailed understanding of how this resolution is affected by experimental parameters. Here, we use coalescence theory to calculate the expected genomic resolution of bulk segregant analysis for a simple monogenic trait. We first show that in an idealized interbreeding population of infinite size, the expected length of the mapped region is inversely proportional to the recombination rate, the number of generations of interbreeding, and the number of genomes sampled, as intuitively expected. In a finite population, coalescence events in the genealogy of the sample reduce the number of potentially informative recombination events during interbreeding, thereby increasing the length of the mapped region. This is incorporated into our model by an effective population size parameter that specifies the pairwise coalescence rate of the interbreeding population. The mapping resolution predicted by our calculations closely matches numerical simulations and is surprisingly robust to moderate levels of contamination of the segregant pools with alternative alleles. Furthermore, we show that the approach can easily be extended to modifications of the crossing scheme. Our framework will allow researchers to predict the expected power of their mapping experiments, and to evaluate how their experimental design could be tuned to optimize mapping resolution.
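
The scaling stated in the abstract for the infinite-population case can be illustrated with a small sketch. The abstract gives only the proportionality (expected mapped length inversely proportional to recombination rate, generations of interbreeding, and genomes sampled), not the constant, so the code below compares two designs by ratio only. The function name and parameter names are hypothetical, and the finite-population correction through effective population size is not modeled here.

```python
def mapped_length_fold_change(base, alt):
    """Ratio of expected mapped-region lengths (alt / base).

    Assumes only the infinite-population proportionality from the abstract:
    expected length is proportional to 1 / (r * t * n), where
      r = recombination rate,
      t = number of generations of interbreeding,
      n = number of genomes sampled.
    The unknown proportionality constant cancels in the ratio.
    `base` and `alt` are dicts with keys "r", "t", "n" (hypothetical names).
    """
    for design in (base, alt):
        if min(design["r"], design["t"], design["n"]) <= 0:
            raise ValueError("r, t and n must all be positive")
    base_product = base["r"] * base["t"] * base["n"]
    alt_product = alt["r"] * alt["t"] * alt["n"]
    return base_product / alt_product


# Hypothetical designs: same recombination rate and sample size,
# but twice as many generations of interbreeding in the alternative.
baseline = {"r": 1e-8, "t": 10, "n": 100}
longer_cross = {"r": 1e-8, "t": 20, "n": 100}
print(mapped_length_fold_change(baseline, longer_cross))
```

Under this idealized scaling, doubling the number of generations halves the expected mapped length. In a finite population the abstract indicates the mapped region is longer than this, so the ratio should be read as a best case.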

A model of functionally buffered deleterious mutations can lead to signatures of positive selection

Wenzel et al. 2022

Preprint

Selective pressures on DNA sequences often result in signatures of departures from neutral evolution that can be captured by the McDonald-Kreitman (MK) test. However, the nature of such selective forces mostly remains unknown to experimentalists. Here we use the bag of marbles (bam) gene in Drosophila to investigate different types of driving forces behind positive selection. We examine two evolutionary models for bam. The Conflict model originates from a conflict of fitness between Drosophila and Wolbachia that causes reciprocal adaptations in each, resulting in diversifying selection on the bam protein. In the alternative Buffering model, Wolbachia protects bam from deleterious mutations during an infection and thereby allows such mutations to accumulate and even fix in the population. If Wolbachia is subsequently lost from the species, mutations that revert the gene back towards its original biological function become advantageous. We use simulations to show that both models produce signals of positive selection, though the levels of positive selection under the Conflict model are more easily detected by the MK test. By fitting the two models to the empirical divergence of D. melanogaster from an inferred ancestral sequence, we find that the Conflict model reproduces strong signals of positive selection like those observed empirically, while the Buffering model better recapitulates the physicochemical signatures of the amino acid sequence evolution at bam. Our demonstration that the Buffering model can lead to positive selection suggests a novel mechanism that needs to be considered behind observed signals of positive selection on protein-coding genes.
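
For readers unfamiliar with the MK test mentioned above, the following is a minimal sketch of how it is commonly computed from a 2x2 table of nonsynonymous and synonymous counts for divergence and polymorphism. The neutrality index and the estimator alpha = 1 - (Ds * Pn) / (Dn * Ps) are the standard textbook forms and are an assumption here; the abstract does not say which variant the authors used. The function names and the example counts are hypothetical.

```python
from math import comb


def mk_summary(dn, ds, pn, ps):
    """Return (neutrality_index, alpha) for an MK table.

    dn, ds: nonsynonymous / synonymous fixed differences between species
    pn, ps: nonsynonymous / synonymous polymorphisms within a species
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for the ratio to be defined")
    ni = (pn * ds) / (ps * dn)  # neutrality index; expected to be 1 under neutrality
    alpha = 1.0 - ni            # estimated proportion of adaptive nonsynonymous fixations
    return ni, alpha


def fisher_exact_two_sided(dn, ds, pn, ps):
    """Two-sided Fisher's exact test p-value for the table [[dn, ds], [pn, ps]]."""
    row1 = dn + ds           # all fixed differences
    col1 = dn + pn           # all nonsynonymous changes
    total = dn + ds + pn + ps
    denom = comb(total, row1)

    def prob(x):
        # Hypergeometric probability of x nonsynonymous changes among the fixed ones.
        return comb(col1, x) * comb(total - col1, row1 - x) / denom

    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    p_obs = prob(dn)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts, for illustration only.
ni, alpha = mk_summary(dn=10, ds=20, pn=5, ps=40)
print(ni, alpha, fisher_exact_two_sided(10, 20, 5, 40))
```

An excess of nonsynonymous fixed differences relative to nonsynonymous polymorphism gives a neutrality index below 1 and a positive alpha, which is the kind of signal the abstract refers to as positive selection.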

Predicting the Genomic Resolution of Bulk Segregant Analysis

2021

Preprint

Bulk segregant analysis (BSA) is a technique for identifying the genetic loci that underlie phenotypic trait differences. The basic approach of this method is to compare two pools of individuals from the opposing tails of the phenotypic distribution, sampled from an interbred population. Each pool is sequenced and scanned for alleles that show divergent frequencies between the pools, indicating potential association with the observed trait differences. BSA has already been successfully applied to the mapping of various quantitative trait loci in organisms ranging from yeast to maize. However, these studies have typically suffered from rather low mapping resolution, and we still lack a detailed understanding of how this resolution is affected by experimental parameters. Here, we use coalescence theory to calculate the expected genomic resolution of BSA. We first show that in an idealized interbreeding population of infinite size, the expected length of the mapped region is inversely proportional to the recombination rate, the number of generations of interbreeding, and the number of genomes sampled, as intuitively expected. In a finite population, coalescence events in the genealogy of the sample reduce the number of potentially informative recombination events during interbreeding, thereby increasing the length of the mapped region. This is incorporated into our theory by an effective population size parameter that specifies the pairwise coalescence rate of the interbreeding population. The mapping resolution predicted by our theory closely matches numerical simulations. Furthermore, we show that the approach can easily be extended to modifications of the crossing scheme. Our framework enables researchers to predict the expected power of their mapping experiments, and to evaluate how their experimental design could be tuned to optimize mapping resolution.

Evolution under a model of functionally buffered deleterious mutations can lead to positive selection in protein-coding genes

Wenzel et al. 2023

Selective pressures on DNA sequences often result in departures from neutral evolution that can be captured by the McDonald-Kreitman (MK) test. However, the nature of such selective forces often remains unknown to experimentalists. Amino acid fixations driven by natural selection in protein-coding genes are commonly associated with a genetic arms race or changing biological purposes, leading to proteins with new functionality. Here, we evaluate the expectations of population genetic patterns under a buffering mechanism driving selective amino acids to fixation, which is motivated by an observed phenotypic rescue of otherwise deleterious nonsynonymous substitutions at bag of marbles (bam) and Sex lethal (Sxl) in Drosophila melanogaster. These two genes were shown to experience strong episodic bursts of natural selection potentially due to infections by the endosymbiotic bacterium Wolbachia observed among multiple Drosophila species. Using simulations to implement and evaluate the evolutionary dynamics of a Wolbachia buffering model, we demonstrate that selectively fixed amino acid replacements will occur, but that the proportion of adaptive amino acid fixations and the statistical power of the MK test to detect the departure from an equilibrium neutral model are both significantly lower than seen for an arms race/change-in-function model that favors proteins with diversified amino acids. We find that the observed selection pattern at bam in a natural population of D. melanogaster is more consistent with an arms race model than with the buffering model.