Identification of genomic regions that are identical by descent (IBD) has proven useful in human genetic studies, where such analyses have led to the discovery of familial relatedness and the fine-mapping of disease-critical regions. However, IBD analyses have been underutilized in other organisms, including human pathogens. This is due in part to the lack of statistical methodologies for non-diploid genomes and to the added complexity of multiclonal infections. We have therefore developed an IBD methodology, called isoRelate, for the analysis of haploid recombining microorganisms in the presence of multiclonal infections. Using the inferred IBD status at genomic locations, we have also developed a novel statistic for identifying loci under positive selection, and we propose relatedness networks as a means of exploring shared haplotypes within populations. We evaluate the performance of our methodologies for detecting IBD and selection, including comparisons with existing tools, and then perform an exploratory analysis of whole-genome sequencing data from a global Plasmodium falciparum dataset of more than 2500 genomes. This analysis identifies Southeast Asia as having many highly related isolates, possibly as a result of both reduced transmission from intensified control efforts and population bottlenecks following the emergence of antimalarial drug resistance. Many signals of selection are also identified, most of which overlap genes known to be associated with drug resistance, in addition to two novel signals observed in multiple countries that have yet to be explored in detail. Additionally, we investigate relatedness networks over the selected loci and determine that one of these sweeps has spread between continents while the other has arisen independently in different countries.
IBD analysis of microorganisms using isoRelate can be used for exploring population structure, positive selection and haplotype distributions, and will be a valuable tool for monitoring disease control and elimination efforts of many diseases.
Understanding how malaria parasites gain entry into human red blood cells is essential for developing strategies to stop blood stage infection. Plasmodium vivax preferentially invades reticulocytes, which are immature red blood cells. The organism has two erythrocyte-binding protein families: the Duffy-binding protein (PvDBP) and the reticulocyte-binding protein (PvRBP) families. Several members of the PvRBP family bind reticulocytes specifically, suggesting a role in mediating host cell selectivity of P. vivax. Here, we present, to our knowledge, the first high-resolution crystal structure of an erythrocyte-binding domain from PvRBP2a, solved at 2.12 Å resolution. The monomeric molecule consists of 10 α-helices and one short β-hairpin, and, although the structural fold is similar to that of PfRh5 (the essential invasion ligand in Plasmodium falciparum), its surface properties are distinct and provide a possible mechanism for recognition of alternate receptors. Sequence alignments of the crystallized fragment of PvRBP2a with other PvRBPs highlight the conserved placement of disulfide bonds. PvRBP2a binds mature red blood cells through recognition of an erythrocyte receptor that is neuraminidase- and chymotrypsin-resistant but trypsin-sensitive. By examining the patterns of sequence diversity within field isolates, we have identified and mapped polymorphic residues to the PvRBP2a structure. Using mutagenesis, we have also defined the critical residues required for erythrocyte binding. Characterization of the structural features that govern functional erythrocyte binding for the PvRBP family provides a framework for generating new tools that block P. vivax blood stage infection.
parasite invasion | X-ray crystallography | SAXS | reticulocyte binding protein | malaria
The most widely distributed recurring malaria infections globally are caused by Plasmodium vivax, which accounts for 80-100 million malaria infections per year (1).
The majority of clinical symptoms associated with malaria are due to blood stage infection (2). The merozoite forms of malaria parasites invade human erythrocytes through a multistep process that involves initial contact with the red blood cell, apical reorientation of the merozoite, and the formation of a tight junction that moves progressively toward the posterior end of the parasite until host cell membrane fusion is completed. These steps in invasion depend on specific interactions between parasite adhesins and their cognate erythrocyte receptors (reviewed in ref. 3). P. vivax preferentially invades reticulocytes, i.e., immature red blood cells (4). The basis of host cell selectivity by merozoites from Plasmodium spp. seems to be mediated primarily by families of adhesin proteins. The two erythrocyte-binding protein families of P. vivax are called the Duffy-binding protein (PvDBP) and reticulocyte-binding protein (PvRBP) families (5). In laboratory-adapted P. vivax strains, there is only one PvDBP protein that binds to the Duffy antigen receptor for chemokines (DARC) (6...
The Bioconductor project provides many interoperable data abstractions for analyzing high-throughput genomics experiments; however, implementing a typical genomic workflow with Bioconductor requires learning these abstractions and understanding how they fit together. This places a large cognitive burden on the user, especially on non-programmers. To reduce this burden, we have created a grammar of genomic data transformation that operates on a single, central Bioconductor data structure, GRanges, which naturally represents genomic intervals and their associated measurements. The grammar defines verbs for performing actions on and between genomic interval data through a simplified, coherent interface to existing Bioconductor infrastructure, resulting in fluent analysis workflows. We have implemented this grammar as an R/Bioconductor package called plyranges.
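To make the idea of a verb-based grammar over genomic intervals concrete, the following is a minimal Python sketch. It is not the plyranges API and does not use GRanges: the `Interval` and `Ranges` classes, their field names, and the example coordinates are all hypothetical and chosen only for illustration. Each verb returns a new object, which is what allows the calls to be chained into a fluent workflow.

```python
from dataclasses import dataclass, field


@dataclass
class Interval:
    """Hypothetical stand-in for one genomic range plus its measurements."""
    chrom: str
    start: int  # 1-based, inclusive (an assumption of this sketch)
    end: int    # inclusive
    meta: dict = field(default_factory=dict)


class Ranges:
    """Hypothetical container; every verb returns a new Ranges so calls chain."""

    def __init__(self, intervals):
        self.intervals = list(intervals)

    def mutate(self, **columns):
        """Add or overwrite metadata columns computed from each interval."""
        out = []
        for iv in self.intervals:
            meta = dict(iv.meta)
            for name, fn in columns.items():
                meta[name] = fn(iv)
            out.append(Interval(iv.chrom, iv.start, iv.end, meta))
        return Ranges(out)

    def filter(self, predicate):
        """Keep only the intervals for which predicate(interval) is true."""
        return Ranges(iv for iv in self.intervals if predicate(iv))

    def filter_by_overlaps(self, other):
        """Keep intervals that overlap at least one interval in `other`."""
        def overlaps_any(iv):
            return any(
                iv.chrom == o.chrom and iv.start <= o.end and o.start <= iv.end
                for o in other.intervals
            )
        return self.filter(overlaps_any)


# Made-up example data: three "genes" and two "peaks".
genes = Ranges([
    Interval("chr1", 100, 200, {"name": "a"}),
    Interval("chr1", 500, 900, {"name": "b"}),
    Interval("chr2", 100, 200, {"name": "c"}),
])
peaks = Ranges([Interval("chr1", 150, 160), Interval("chr2", 300, 400)])

# A chained workflow: compute widths, keep short genes, keep those under a peak.
result = (
    genes
    .mutate(width=lambda iv: iv.end - iv.start + 1)
    .filter(lambda iv: iv.meta["width"] <= 200)
    .filter_by_overlaps(peaks)
)
names = [iv.meta["name"] for iv in result.intervals]
```

The overlap test here is a quadratic scan, which is acceptable for a toy example but not for genome-scale data, where an interval tree or a sorted sweep would be the usual choice.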
Deriving biological insights from genomic data commonly requires comparing attributes of selected genomic loci to a null set of loci. The selection of this null set is nontrivial, as it requires careful consideration of potential covariates, a problem that is exacerbated by the non-uniform distribution of genomic features, including genes, enhancers, and transcription factor binding sites. Propensity score-based covariate matching methods allow selection of null sets from a pool of possible items while controlling for multiple covariates; however, existing packages do not operate on genomic data classes and can be slow for large data sets, making them difficult to integrate into genomic workflows. To address this, we developed matchRanges, a propensity score-based covariate matching method for the efficient and convenient generation of matched null ranges from a set of background ranges within the Bioconductor framework.
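The general idea of propensity score-based matching can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the matchRanges implementation: the single covariate (GC content), the simulated data, the hand-written logistic regression, and the greedy nearest-neighbour matching without replacement are all choices made for this sketch, and the function names are hypothetical.

```python
import math
import random


def predict(w, x):
    """Logistic model: probability that an item with covariates x is focal."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well within range
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent (bias is w[0])."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def match_nearest(focal_scores, pool_scores):
    """Greedily pair each focal item with the closest unused pool item."""
    available = list(range(len(pool_scores)))
    matches = []
    for fs in focal_scores:
        best = min(available, key=lambda i: abs(pool_scores[i] - fs))
        matches.append(best)
        available.remove(best)
    return matches


# Simulated covariate (assumption): focal loci are GC-rich, the pool is broad.
random.seed(0)
focal_gc = [random.gauss(0.6, 0.05) for _ in range(30)]
pool_gc = [random.uniform(0.2, 0.8) for _ in range(300)]

# Propensity score = P(focal | covariates), fitted on focal and pool together.
X = [[g] for g in focal_gc + pool_gc]
y = [1] * len(focal_gc) + [0] * len(pool_gc)
w = fit_logistic(X, y)
focal_scores = [predict(w, [g]) for g in focal_gc]
pool_scores = [predict(w, [g]) for g in pool_gc]

matched = match_nearest(focal_scores, pool_scores)


def mean(values):
    return sum(values) / len(values)


focal_mean = mean(focal_gc)
pool_mean = mean(pool_gc)
matched_mean = mean([pool_gc[i] for i in matched])
```

Comparing `matched_mean` with `focal_mean` and `pool_mean` shows the purpose of matching: the selected null set should resemble the focal set in its covariate distribution much more closely than the unmatched pool does. With several covariates, the same score collapses them into one dimension before matching.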