Background: Cataloguing the distribution of genes within natural bacterial populations is essential for understanding evolutionary processes and the genetic basis of adaptation. Advances in whole genome sequencing technologies have led to a vast expansion in the number of bacterial genomes deposited in public databases. There is a pressing need for software solutions that can cluster, catalogue and characterise genes, or other features, in increasingly large genomic datasets. Results: Here we present a pangenomics toolbox, PIRATE (Pangenome Iterative Refinement and Threshold Evaluation), which identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds. PIRATE builds upon recent scalable software developments to allow for the rapid interrogation of thousands of isolates. PIRATE clusters genes (or other annotated features) over a wide range of amino acid or nucleotide identity thresholds and uses the clustering information to rapidly identify paralogous gene families and putative fission/fusion events. Furthermore, PIRATE orders the pangenome using a directed graph, provides a measure of allelic variation, and estimates sequence divergence for each gene family. Conclusions: We demonstrate that PIRATE scales linearly with both number of samples and computation resources, allowing for analysis of large genomic datasets, and compares favorably to other popular tools. PIRATE provides a robust framework for analysing bacterial pangenomes, from largely clonal to panmictic species.
Cataloguing the distribution of genes within natural bacterial populations is essential for understanding evolutionary processes and the genetic basis of adaptation. Here we present a pangenomics toolbox, PIRATE (Pangenome Iterative Refinement And Threshold Evaluation), which identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds. PIRATE builds upon recent scalable software developments to allow for the rapid interrogation of thousands of isolates. PIRATE clusters genes (or other annotated features) over a wide range of amino-acid or nucleotide identity thresholds and uses the clustering information to rapidly classify paralogous gene families into either putative fission/fusion events or gene duplications. Furthermore, PIRATE orders the pangenome using a directed graph, provides a measure of allelic variation and estimates sequence divergence for each gene family. We demonstrate that PIRATE scales linearly with both number of samples and computation resources, allowing for analysis of large genomic datasets, and compares favorably to other popular tools. PIRATE provides a robust framework for analysing bacterial pangenomes, from largely clonal to panmictic species. Availability: PIRATE is implemented in Perl and is freely available under a GNU GPL 3 open source license from https://github.com/SionBayliss/PIRATE. Contact: s.bayliss@bath.ac.uk
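The idea of clustering the same genes at several identity thresholds can be illustrated with a toy sketch. This is not PIRATE's implementation (PIRATE is written in Perl and its internal clustering steps are not described in the abstract); the single-linkage rule, the function names `cluster_at_threshold` and `threshold_sweep`, and the identity values below are all assumptions made purely for illustration.

```python
from itertools import combinations


def cluster_at_threshold(genes, identity, threshold):
    """Toy single-linkage clustering: link two genes when their
    percent identity is at least `threshold` (illustrative only)."""
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent pointers to the cluster representative.
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for a, b in combinations(genes, 2):
        # Pairs missing from the table are treated as 0% identity.
        if identity.get(frozenset((a, b)), 0.0) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(c) for c in clusters.values())


def threshold_sweep(genes, identity, thresholds):
    """Cluster the same genes once per identity threshold."""
    return {t: cluster_at_threshold(genes, identity, t) for t in thresholds}


# Hypothetical percent identities between four genes.
genes = ["g1", "g2", "g3", "g4"]
identity = {
    frozenset(("g1", "g2")): 98.0,
    frozenset(("g1", "g3")): 72.0,
    frozenset(("g2", "g3")): 70.0,
    frozenset(("g3", "g4")): 55.0,
}

sweep = threshold_sweep(genes, identity, [50, 70, 90])
for t, clusters in sweep.items():
    print(t, clusters)
```

In this toy data a single family at 50% identity splits into progressively smaller clusters at 70% and 90%, which is the kind of information a threshold sweep exposes about divergence within a gene family.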
Humans have profoundly affected the ocean environment but little is known about anthropogenic effects on the distribution of microbes. Vibrio parahaemolyticus is found in warm coastal waters and causes gastroenteritis in humans and economically significant disease in shrimps. Based on data from 1103 genomes of environmental and clinical isolates, we show that V. parahaemolyticus is divided into four diverse populations, VppUS1, VppUS2, VppX and VppAsia. The first two are largely restricted to the US and Northern Europe, while the others are found worldwide, with VppAsia making up the great majority of isolates in the seas around Asia. Patterns of diversity within and between the populations are consistent with them having arisen by progressive divergence via genetic drift during geographical isolation. However, we find that there is substantial overlap in their current distribution. These observations can be reconciled without requiring genetic barriers to exchange between populations if long-range dispersal has increased dramatically in the recent past. We found that VppAsia isolates from the US have an average of 1.01% more shared ancestry with VppUS1 and VppUS2 isolates than VppAsia isolates from Asia itself. Based on time calibrated trees of divergence within epidemic lineages, we estimate that recombination affects about 0.017% of the genome per year, implying that the genetic mixture has taken place within the last few decades. These results suggest that human activity, such as shipping, aquatic products trade and increased human migration between continents, is responsible for the change in the distribution pattern of this species.
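The "last few decades" figure can be sanity-checked with simple arithmetic on the two numbers quoted in the abstract. The sketch below is a back-of-the-envelope reading, not the authors' calculation: it assumes the 1.01% excess shared ancestry accumulated at a constant 0.017% of the genome per year, and the variable names are illustrative.

```python
# Values quoted in the abstract.
excess_shared_ancestry_pct = 1.01    # extra shared ancestry, % of genome
recombination_pct_per_year = 0.017   # genome affected by recombination, % per year

# Assumption: the excess ancestry accumulated at a constant rate.
years = excess_shared_ancestry_pct / recombination_pct_per_year
print(f"Implied time scale: about {years:.0f} years")
```

Under that assumption the implied time scale is roughly six decades, which is of the same order as the abstract's statement.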
Climate change, changing farming practices, social and demographic changes and rising levels of antibiotic resistance are likely to lead to future increases in opportunistic bacterial infections that are more difficult to treat. Uncovering the prevalence and identity of pathogenic bacteria in the environment is key to assessing transmission risks. We describe the first use of the wax moth larva Galleria mellonella, a well-established model for the mammalian innate immune system, to selectively enrich and characterize pathogens from coastal environments in the South West of the UK. Whole-genome sequencing of highly virulent isolates revealed, amongst others, a Proteus mirabilis strain carrying the Salmonella SGI1 genomic island, not reported from the UK before, and the recently described species Vibrio injenensis, hitherto only reported from human patients in Korea. Our novel method has the power to detect bacterial pathogens in the environment that potentially pose a serious risk to public health.
Background: Humans have profoundly affected the ocean environment but little is known about anthropogenic effects on the distribution of microbes. Vibrio parahaemolyticus is found in warm coastal waters and causes gastroenteritis in humans and economically significant disease in shrimps. Results: Based on data from 1,103 genomes, we show that V. parahaemolyticus is divided into four diverse populations, VppUS1, VppUS2, VppX and VppAsia. The first two are largely restricted to the US and Northern Europe, while the others are found worldwide, with VppAsia making up the great majority of isolates in the seas around Asia. Patterns of diversity within and between the populations are consistent with them having arisen by progressive divergence via genetic drift during geographical isolation. However, we find that there is substantial overlap in their current distribution. These observations can be reconciled without requiring genetic barriers to exchange between populations if dispersal between oceans has increased dramatically in the recent past. We found that VppAsia isolates from the US have an average of 1.01% more shared ancestry with VppUS1 and VppUS2 isolates than VppAsia isolates from Asia itself. Based on time calibrated trees of divergence within epidemic lineages, we estimate that recombination affects about 0.017% of the genome per year, implying that the genetic mixture has taken place within the last few decades. Conclusions: These results suggest that human activity, such as shipping and aquatic products trade, is responsible for the change in the distribution pattern of this marine species.
Keywords
Vibrio parahaemolyticus, population structure, biogeography, anthropogenic change, ocean dispersal

Background
Hospitable environments for particular marine microbes can be separated by large distances, but whether dispersal barriers substantially influence their distribution and evolution is unknown. There are many studies of the distribution of marine microbes, e.g. [1-4], but these typically survey patterns of macro-scale diversity. Differences in species level or genus level composition between locations are as likely to reflect environmental heterogeneity as dispersal, making the patterns difficult to interpret. Recent spread of microbes between continents has been documented for lineages that cause pathogenic infection of humans, including notorious clonal groups within Vibrio parahaemolyticus and Vibrio cholerae [5-8]. However, these lineages are unusual in using humans as vectors, which might facilitate long-range dispersal as in the case of the Haitian cholera outbreak [9]. We currently have little information on rates of spread of the great majority of environmental organisms that do not colonize large-animal hosts.

V. parahaemolyticus prefers warm coastal waters and causes gastroenteritis in humans [10, 11]. Disease outbreaks became common from the 1990s and became global, due to spread of pa...