Summary: A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors. Availability and implementation: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary Contact: roary@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Author Contributions: HG and DMA conceived the study. The EuSCAPE working group collected the bacterial isolates and epidemiological data, and performed preliminary laboratory analyses. The ESGEM facilitated the training and capacity building for the collection of bacterial isolates and preliminary analyses.
The genus Legionella comprises 65 species, among which Legionella pneumophila is a human pathogen causing severe pneumonia. To understand the evolution of an environmental to an accidental human pathogen, we have functionally analyzed 80 Legionella genomes spanning 58 species. Uniquely, an immense repository of 18,000 secreted proteins encoding 137 different eukaryotic-like domains and over 200 eukaryotic-like proteins is paired with a highly conserved type IV secretion system (T4SS). Specifically, we show that eukaryotic Rho- and Rab-GTPase domains are found nearly exclusively in eukaryotes and Legionella. Translocation assays for selected Rab-GTPase proteins revealed that they are indeed T4SS secreted substrates. Furthermore, F-box, U-box, and SET domains were present in >70% of all species, suggesting that manipulation of host signal transduction, protein turnover, and chromatin modification pathways are fundamental intracellular replication strategies for legionellae. In contrast, the Sec-7 domain was restricted to L. pneumophila and seven other species, indicating effector repertoire tailoring within different amoebae. Functional screening of 47 species revealed 60% were competent for intracellular replication in THP-1 cells, but interestingly, this phenotype was associated with diverse effector assemblages. These data, combined with evolutionary analysis, indicate that the capacity to infect eukaryotic cells has been acquired independently many times within the genus and that a highly conserved yet versatile T4SS secretes an exceptional number of different proteins shaped by interdomain gene transfer. Furthermore, we revealed the surprising extent to which legionellae have coopted genes and thus cellular functions from their eukaryotic hosts, providing an understanding of how dynamic reshuffling and gene acquisition have led to the emergence of major human pathogens.