Simple satellites are tandemly repeating short DNA motifs that can span megabases in eukaryotic genomes. Because they can cause genomic instability through nonallelic homologous exchange, they are primarily found in the repressive heterochromatin near centromeres and telomeres where recombination is minimal, and on the Y chromosome, where they accumulate as the chromosome degenerates. Interestingly, the types and abundances of simple satellites often vary dramatically between closely related species, suggesting that they turn over rapidly. However, limited sampling has prevented detailed understanding of their evolutionary dynamics. Here, we characterize simple satellites from whole-genome sequences generated from males and females of nine Drosophila species, spanning 40 Ma of evolution. We show that PCR-free library preparation and postsequencing GC-correction better capture satellite quantities than conventional methods. We find that over half of the 207 simple satellites identified are species-specific, consistent with previous descriptions of their rapid evolution. Based on a maximum parsimony framework, we determined that most interspecific differences are due to lineage-specific gains. Simple satellites gained within a species are typically a single mutation away from abundant existing satellites, suggesting that they likely emerge from existing satellites, especially in the genomes of satellite-rich species. Interestingly, unlike most of the other lineages which experience various degrees of gains, the lineage leading up to the satellite-poor D. pseudoobscura and D. persimilis appears to be recalcitrant to gains, providing a counterpoint to the notion that simple satellites are universally rapidly evolving.
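The observation that newly gained satellites are "typically a single mutation away" from abundant existing ones can be illustrated with a short sketch. It assumes that a tandem-repeat motif is equivalent to any of its rotations and to its reverse complement, and it counts only single-base substitutions. The study's own distance definition may differ (for example, by also allowing insertions or deletions). The function names and example motifs are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: compares simple-satellite motifs under the
# assumption that rotations and reverse complements describe the same repeat.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif: str) -> set:
    """All cyclic rotations of a motif (the same tandem repeat, shifted)."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def equivalents(motif: str) -> set:
    """Every representation of the same repeat: rotations on both strands."""
    return rotations(motif) | rotations(reverse_complement(motif))


def canonical(motif: str) -> str:
    """Pick the alphabetically smallest representation as a stable label."""
    return min(equivalents(motif))


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_one_substitution_away(a: str, b: str) -> bool:
    """True if two distinct repeats differ by exactly one base substitution
    in at least one rotation/strand alignment."""
    if len(a) != len(b) or canonical(a) == canonical(b):
        return False
    return any(hamming(a, other) == 1 for other in equivalents(b))


if __name__ == "__main__":
    print(canonical("CTGTT"))
    print(is_one_substitution_away("AAGAG", "AACAG"))
```

Comparing every representation of the second motif against the first avoids missing a match that is only visible after shifting the repeat or switching strands. The check on canonical forms ensures that a motif is not reported as a neighbor of itself.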
A long-standing evolutionary puzzle is that all eukaryotic genomes contain large amounts of tandemly-repeated DNA whose sequence motifs and abundance vary greatly among even closely related species. To elucidate the evolutionary forces governing tandem repeat dynamics, quantification of the rates and patterns of mutations in repeat copy number and tests of its selective neutrality are necessary. Here, we used whole-genome sequences of 28 mutation accumulation (MA) lines of , in addition to six isolates from a non-MA population originating from the same progenitor, to both estimate mutation rates of abundances of repeat sequences and evaluate the selective regime acting upon them. We found that mutation rates of individual repeats were both high and highly variable, ranging from additions/deletions of 0.29-105 copies per generation (reflecting changes of 0.12-0.80% per generation). Our results also provide evidence that new repeat sequences are often formed from existing ones. The non-MA population isolates showed a signal of either purifying or stabilizing selection, with 33% lower variation in repeat copy number on average than the MA lines, although the level of selective constraint was not evenly distributed across all repeats. The changes between many pairs of repeats were correlated, and the pattern of correlations was significantly different between the MA lines and the non-MA population. Our study demonstrates that tandem repeats can experience extremely rapid evolution in copy number, which can lead to high levels of divergence in genome-wide repeat composition between closely related species.
Resistance to insecticides has evolved in multiple insect species, leading to increased application rates and even control failures. Understanding the genetic basis of insecticide resistance is fundamental for mitigating its impact on crop production and disease control. We used a genome-wide association study (GWAS) approach with the Drosophila Genetic Reference Panel (DGRP) to identify the mutations involved in resistance to two widely used classes of insecticides: organophosphates (OPs, parathion) and pyrethroids (deltamethrin). Most variation in parathion resistance was associated with mutations in the target gene Ace, while most variation in deltamethrin resistance was associated with mutations in Cyp6a23, a gene encoding a detoxification enzyme never previously associated with resistance. A “nested GWAS” further revealed the contribution of other loci: Dscam1 and trpl were implicated in resistance to parathion, but only in lines lacking Wolbachia. Cyp6a17, the paralogous gene of Cyp6a23, and CG7627, an ATP-binding cassette transporter, were implicated in deltamethrin resistance. We observed signatures of recent selective sweeps at all of these resistance loci and confirmed that the soft sweep at Ace is indeed driven by the identified resistance mutations. Analysis of allele frequencies in additional population samples revealed that most resistance mutations are segregating across the globe, but that frequencies can vary substantially among populations. Altogether, our data reveal that the widely used OP and pyrethroid insecticides imposed a strong selection pressure on natural insect populations. However, it remains unclear why, in Drosophila, resistance evolved due to changes in the target site for OPs, but due to a detoxification enzyme for pyrethroids.
Bayesian estimates of divergence times based on the molecular clock yield uncertainty of parameter estimates measured by the width of posterior distributions of node ages. For the relaxed molecular clock, previous works have reported that some of the uncertainty inherent to the variation of rates among lineages may be reduced by partitioning data. Here we test this effect for the purely morphological clock, using placental mammals as a case study. We applied the uncorrelated lognormal relaxed clock to morphological data of 40 extant mammalian taxa and 4,533 characters, taken from the largest published matrix of discrete phenotypic characters. The morphologically derived timescale was compared to divergence times inferred from molecular and combined data. We show that partitioning data into anatomical units significantly reduced the uncertainty of divergence time estimates for morphological data. For the first time, we demonstrate that ascertainment bias has an impact on the precision of morphological clock estimates. While analyses including molecular data suggested most divergences between placental orders occurred near the K‐Pg boundary, the partitioned morphological clock recovered older interordinal splits and some younger intraordinal ones, including significantly later dates for the radiation of bats and rodents, which accord with the short‐fuse hypothesis.
A selective sweep occurs when positive selection drives an initially rare allele to high population frequency. In nature, the precise parameters of a sweep are seldom known: How strong was positive selection? Did the sweep involve only a single adaptive allele (hard sweep) or were multiple adaptive alleles at the locus sweeping at the same time (soft sweep)? If the sweep was soft, did these alleles originate from recurrent new mutations (RNM) or from standing genetic variation (SGV)? Here, we present a method based on supervised machine learning to infer such parameters from the patterns of genetic variation observed around a given sweep locus. Our method is trained on sweep data simulated with SLiM, a fast and flexible framework that allows us to generate training data across a wide spectrum of evolutionary scenarios and can be tailored towards the specific population of interest. Inferences are based on summary statistics describing patterns of nucleotide diversity, haplotype structure, and linkage disequilibrium, which are estimated across systematically varying genomic window sizes to capture sweeps across a wide range of selection strengths. We show that our method can accurately infer selection coefficients in the range 0.01 < s < 100 and classify sweep types between hard sweeps, RNM soft sweeps, and SGV soft sweeps with accuracy 69% to 95% depending on sweep strength. We also show that the method infers the correct sweep types at three empirical loci known to be associated with the recent evolution of pesticide resistance in Drosophila melanogaster. Our study demonstrates the power of machine learning for inferring sweep parameters from present-day genotyping samples, opening the door to a better understanding of the modes of adaptive evolution in nature.
Author summary
Adaptation often involves the rapid spread of a beneficial genetic variant through the population in a process called a selective sweep. Here, we develop a method based on machine learning that can infer the strength of selection driving such a sweep, and distinguish whether it involved only a single adaptive variant (a so-called hard sweep) or several adaptive variants of independent origin that were simultaneously rising in frequency at the same genomic position (a so-called soft selective sweep). Our machine learning method is trained on simulated data and only requires data sampled from a single population at a single point in time. To address the challenge of simulating realistic datasets for training, we explore the behavior of the method under a variety of testing scenarios, including scenarios where the history of the population of interest was misspecified. Finally, to illustrate the accuracy of our method, we apply it to three known sweep loci that have contributed to the evolution of pesticide resistance in Drosophila melanogaster.
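The idea of computing summary statistics over windows of several sizes around a candidate sweep locus can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the two statistics (pairwise nucleotide diversity and haplotype homozygosity), the window scheme, and all function names are assumptions made for the example. Haplotypes are assumed to be equal-length strings of 0/1 alleles.

```python
# Illustrative sketch only: multi-scale summary statistics around a focal site.
from collections import Counter
from itertools import combinations


def nucleotide_diversity(haplotypes, start, end):
    """Mean number of pairwise differences per site in the window [start, end)."""
    n = len(haplotypes)
    if n < 2 or end <= start:
        raise ValueError("need at least two haplotypes and a non-empty window")
    n_pairs = n * (n - 1) / 2
    total_diffs = sum(
        sum(a != b for a, b in zip(h1[start:end], h2[start:end]))
        for h1, h2 in combinations(haplotypes, 2)
    )
    return total_diffs / n_pairs / (end - start)


def haplotype_homozygosity(haplotypes, start, end):
    """Sum of squared haplotype frequencies in the window [start, end)."""
    counts = Counter(h[start:end] for h in haplotypes)
    n = len(haplotypes)
    return sum((count / n) ** 2 for count in counts.values())


def windowed_features(haplotypes, center, half_widths):
    """Feature vector: both statistics for each window size around `center`."""
    length = len(haplotypes[0])
    features = []
    for half_width in half_widths:
        # Clip each window to the boundaries of the sampled region.
        start = max(0, center - half_width)
        end = min(length, center + half_width + 1)
        features.append(nucleotide_diversity(haplotypes, start, end))
        features.append(haplotype_homozygosity(haplotypes, start, end))
    return features


if __name__ == "__main__":
    sample = ["0000", "0000", "0011", "0011"]
    print(windowed_features(sample, center=2, half_widths=[0, 1]))
```

A feature vector of this kind, computed on simulated sweeps with known parameters, could serve as input to a supervised learner. Using several window sizes reflects the point made in the abstract that strong and weak sweeps leave footprints of different widths.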