SARS-CoV-2 is a SARS-like coronavirus of likely zoonotic origin first identified in December 2019 in Wuhan, the capital of China's Hubei province. The virus has since spread globally, resulting in the currently ongoing COVID-19 pandemic. The first whole genome sequence was published on January 5 2020, and thousands of genomes have been sequenced since this date. This resource allows unprecedented insights into the past demography of SARS-CoV-2 but also monitoring of how the virus is adapting to its novel human host, providing information to direct drug and vaccine design. We curated a dataset of 7666 public genome assemblies and analysed the emergence of genomic diversity over time. Our results are in line with previous estimates and point to all sequences sharing a common ancestor towards the end of 2019, supporting this as the period when SARS-CoV-2 jumped into its human host. Due to extensive transmission, the genetic diversity of the virus in several countries recapitulates a large fraction of its worldwide genetic diversity. We identify regions of the SARS-CoV-2 genome that have remained largely invariant to date, and others that have already accumulated diversity. By focusing on mutations which have emerged independently multiple times (homoplasies), we identify 198 filtered recurrent mutations in the SARS-CoV-2 genome. Nearly 80% of the recurrent mutations produced non-synonymous changes at the protein level, suggesting possible ongoing adaptation of SARS-CoV-2. Three sites in Orf1ab in the regions encoding Nsp6, Nsp11, Nsp13, and one in the Spike protein are characterised by a particularly large number of recurrent mutations (>15 events) which may signpost convergent evolution and are of particular interest in the context of adaptation of SARS-CoV-2 to the human host. We additionally provide an interactive user-friendly web-application to query the alignment of the 7666 SARS-CoV-2 genomes.
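As an illustration of what detecting a homoplasy involves, the following is a minimal sketch that uses Fitch parsimony to count the minimum number of state changes each alignment column requires on a fixed binary tree. It is not the pipeline used in the study, which the abstract does not describe. The tree, the sample names (`s1` to `s4`), the toy sequences, and the function names are all hypothetical. A site is flagged when it needs more changes than the number of observed states minus one, meaning at least one state must have arisen more than once.

```python
def fitch_changes(tree, states):
    """Return (state_set, n_changes) for one alignment column on a binary tree.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees.
    `states` maps each leaf name to its nucleotide at this column.
    """
    if isinstance(tree, str):  # leaf: the state set is the observed base
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_changes(left, states)
    right_set, right_changes = fitch_changes(right, states)
    changes = left_changes + right_changes
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no change needed here
        return shared, changes
    return left_set | right_set, changes + 1  # disagreement costs one change


def homoplasic_sites(tree, alignment):
    """List (position, min_changes) for columns needing excess changes.

    Positions are 0-based. A column with k distinct states needs at least
    k - 1 changes on any tree; more than that implies recurrent mutation.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    flagged = []
    for pos in range(length):
        column = {name: alignment[name][pos] for name in names}
        n_states = len(set(column.values()))
        _, changes = fitch_changes(tree, column)
        if changes > n_states - 1:
            flagged.append((pos, changes))
    return flagged


# Hypothetical toy data: four samples on the tree ((s1, s2), (s3, s4)).
tree = (("s1", "s2"), ("s3", "s4"))
alignment = {
    "s1": "ACGT",
    "s2": "ATGT",
    "s3": "ACGA",
    "s4": "ATGA",
}
print(homoplasic_sites(tree, alignment))
```

In this toy example only position 1 is flagged: the C/T pattern cuts across both clades and needs two changes, whereas the T/A pattern at position 3 matches the tree and needs one. A real analysis would also have to handle ambiguous bases, gaps, multifurcating trees, and uncertainty in the tree itself, none of which this sketch attempts.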
Understanding the process of adaptation during rapid environmental change remains one of the central focal points of evolutionary biology. The recently formed White Sands system of southern New Mexico offers an outstanding example of rapid adaptation, with a variety of species having rapidly evolved blanched forms on the dunes that contrast with their close relatives in the surrounding dark soil habitat. In this study, we focus on two of the White Sands lizard species, Sceloporus cowlesi and Aspidoscelis inornata, for which previous research has linked mutations in the melanocortin-1 receptor gene (Mc1r) to blanched coloration. We sampled populations both on and off the dunes and used a custom sequence capture assay based on probed fosmid libraries to obtain >50 kb of sequence around Mc1r and hundreds of other random genomic locations. We then used model-based statistical inference methods to describe the demographic and adaptive history characterizing the colonization of White Sands. We identified a number of similarities between the two focal species, including strong evidence of selection in the blanched populations in the Mc1r region. We also found important differences between the species, suggesting different colonization times, different genetic architecture underlying the blanched phenotype and different ages of the beneficial alleles. Finally, the beneficial allele is dominant in S. cowlesi and recessive in A. inornata, allowing for a rare empirical test of theoretically expected patterns of selective sweeps under these differing models.
Many features of virus populations make them excellent candidates for population genetic study, including a very high rate of mutation, high levels of nucleotide diversity, exceptionally large census population sizes, and frequent positive selection. However, these attributes also mean that special care must be taken in population genetic inference. For example, highly skewed offspring distributions, frequent and severe population bottleneck events associated with infection and compartmentalization, and strong purifying selection all affect the distribution of genetic variation but are often not taken into account. Here, we draw particular attention to multiple-merger coalescent events and background selection, discuss potential misinference associated with these processes, and highlight potential avenues for better incorporating them into future population genetic analyses.