Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
Motivation: Coalescent-based simulation software for genomic sequences allows the efficient in silico generation of short- and medium-sized genetic sequences. However, the simulation of genome-size datasets as produced by next-generation sequencing is currently only possible using fairly crude approximations. Results: We present the sequential coalescent with recombination model (SCRM), a new method that efficiently and accurately approximates the coalescent with recombination, closing the gap between current approximations and the exact model. We present an efficient implementation and show that it can simulate genomic-scale datasets with an essentially correct linkage structure. Availability and implementation: The open source implementation scrm is freely available at https://scrm.github.io under the conditions of the GPLv3 license. Contact: staab@bio.lmu.de or gerton.lunter@well.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
The Horn of Africa harbors the largest reservoir of Plasmodium vivax on the continent. Most of sub-Saharan Africa has remained relatively vivax-free due to a high prevalence of the human Duffy-negative trait, but the emergence of strains able to invade Duffy-negative reticulocytes poses a major public health threat. We undertook the first population genomic investigation of P. vivax from the region, comparing the genomes of 24 Ethiopian isolates against data from Southeast Asia to identify important local adaptations. The prevalence of the Duffy binding protein amplification in Ethiopia was 79%, potentially reflecting adaptation to Duffy negativity. There was also evidence of selection in a region upstream of the chloroquine resistance transporter, a putative chloroquine-resistance determinant. Strong signals of selection were observed in genes involved in immune evasion and regulation of gene expression, highlighting the need for a multifaceted intervention approach to combat P. vivax in the region.