Shigella are pathogens originating within the Escherichia lineage but frequently classified as a separate genus. Shigella genomes contain numerous insertion sequences (ISs) that lead to pseudogenisation of affected genes and an increase in non-homologous recombination. Here, we study 414 genomes of E. coli and Shigella strains to assess the contribution of genomic rearrangements to Shigella evolution. We found that Shigella experienced exceptionally high rates of intragenomic rearrangements and had a decreased rate of homologous recombination compared to pathogenic and non-pathogenic E. coli. The high rearrangement rate resulted in independent disruption of syntenic regions and parallel rearrangements in different Shigella lineages. Specifically, we identified two types of chromosomally encoded E3 ubiquitin-protein ligases acquired independently by all Shigella strains that also showed a high level of sequence conservation in the promoter and further in the 5′-intergenic region. In the only available enteroinvasive E. coli (EIEC) strain, which is a pathogenic E. coli with a phenotype intermediate between Shigella and non-pathogenic E. coli, we found a rate of genome rearrangements comparable to that in other E. coli and no functional copies of the two Shigella-specific E3 ubiquitin ligases. These data indicate that the accumulation of ISs influenced many aspects of genome evolution and played an important role in the evolution of intracellular pathogens. Our research demonstrates the power of comparative genomics based on synteny block composition and the important role of non-coding regions in the evolution of genomic islands.
Motivation: The high plasticity of bacterial genomes is provided by numerous mechanisms, including horizontal gene transfer and recombination via numerous flanking repeats. Genome rearrangements such as inversions, deletions, insertions, and duplications may occur independently in different strains, providing parallel adaptation or phenotypic diversity. Specifically, such rearrangements might be responsible for virulence, antibiotic resistance, and antigenic variation. However, identification of such events requires laborious manual inspection and verification of phyletic pattern consistency.
Results: Here we define the term “parallel rearrangements” as events that occur independently in phylogenetically distant bacterial strains and present a formalization of the problem of calling parallel rearrangements. We implement an algorithmic solution for the identification of parallel rearrangements in bacterial populations as the tool PaReBrick. The tool takes a collection of strains, each represented as a sequence of oriented synteny blocks, and a phylogenetic tree as input data. It identifies rearrangements, tests them for consistency with the tree, and sorts the events by their parallelism score. The tool provides diagrams of the neighbors for each block of interest, allowing the detection of horizontally transferred blocks or their extra copies and the inversions in which copied blocks are involved. We demonstrated PaReBrick’s efficiency and accuracy and showed its potential to detect genome rearrangements responsible for pathogenicity and adaptation in bacterial genomes.
Availability: PaReBrick is written in Python and is available on GitHub: https://github.com/ctlab/parallelrearrangements
Supplementary information: Supplementary data are available at Bioinformatics online.
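The idea of testing rearrangements for consistency with a tree can be illustrated with a small sketch. This is not PaReBrick's implementation. It assumes linear genomes given as lists of signed block numbers, a binary tree given as nested tuples of strain names, and it uses the Fitch parsimony change count as a stand-in for the parallelism score. All function names (`adjacencies`, `fitch_changes`, `parallel_candidates`) and the toy data are hypothetical.

```python
def adjacencies(genome):
    """Return the orientation-aware adjacencies of a linear genome.

    A genome is a list of signed ints: +3 is block 3 read forward, -3 reversed.
    Each block b has two extremities, (b, 't') (tail) and (b, 'h') (head).
    """
    ends = []
    for b in genome:
        block = abs(b)
        # Reading direction decides which extremity is met first.
        first, second = ("t", "h") if b > 0 else ("h", "t")
        ends.append(((block, first), (block, second)))
    return {frozenset((ends[i][1], ends[i + 1][0])) for i in range(len(ends) - 1)}


def fitch_changes(tree, state):
    """Minimum number of 0/1 state changes on a binary tree (Fitch parsimony).

    tree: nested 2-tuples with strain names (str) as leaves.
    state: dict mapping strain name -> 0 or 1.
    """
    def walk(node):
        if isinstance(node, str):
            return {state[node]}, 0
        (left_set, left_cost), (right_set, right_cost) = walk(node[0]), walk(node[1])
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return walk(tree)[1]


def parallel_candidates(genomes, tree):
    """List adjacencies whose presence pattern needs more than one change on the tree.

    More than one change means the pattern cannot be explained by a single
    event, so the adjacency is a candidate parallel rearrangement breakpoint.
    Results are sorted by the change count (assumed proxy for a parallelism score).
    """
    adj = {name: adjacencies(g) for name, g in genomes.items()}
    scores = {}
    for a in set().union(*adj.values()):
        state = {name: int(a in s) for name, s in adj.items()}
        changes = fitch_changes(tree, state)
        if changes > 1:
            scores[a] = changes
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Toy data: strains B and D both carry an inversion of blocks 2..3,
# but the tree places them in different clades.
genomes = {
    "A": [1, 2, 3, 4],
    "B": [1, -3, -2, 4],
    "C": [1, 2, 3, 4],
    "D": [1, -3, -2, 4],
}
tree = (("A", "B"), ("C", "D"))

for adjacency, score in parallel_candidates(genomes, tree):
    print(sorted(adjacency), score)
```

On this toy input, the four adjacencies at the inversion boundaries each require two changes on the tree and are reported, while the internal adjacency between blocks 2 and 3 is shared by all strains and is not. If the tree instead grouped B with D, a single change would explain every pattern and nothing would be reported.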
Keywords: Shigella, Escherichia coli, genomic rearrangements, pathogens, recombination, IS, ipaH
Abstract: The genus Shigella comprises a polyphyletic group of facultative intracellular pathogens that evolved from Escherichia coli. Shigella genomes have accumulated mobile elements, which may have been caused by decreased effective population size and concomitant reduction of purifying selection that allowed their proliferation. Here, we investigated the interplay of the accumulation of genomic repeats with genomic rearrangements and their impact on adaptation in bacterial evolution. We studied 414 genomes of E. coli and Shigella strains to assess the contribution of genomic rearrangements to Shigella pathoadaptation. We show that Shigella accumulated a variety of insertion sequences (ISs), experienced exceptionally high rates of intragenomic rearrangements, and had a decreased rate of homologous recombination. IS families differ in their expansion rates in Shigella lineages, as expected given their independent origin. In contrast, the number of IS elements and, consequently, the rate of genome rearrangements in the enteroinvasive E. coli (EIEC) strain are comparable to those in other E. coli. We found two chromosomal E3 ubiquitin-protein ligases (putative IpaH family proteins) that are functional in all Shigella strains, while only one pseudogenised copy is found in the EIEC strain and none in other E. coli. Taken together, our data indicate that ISs played an important role in the adaptation of Shigella strains to an intracellular lifestyle and that the composition of functional types of ubiquitin-protein ligases may explain the differences in the infectious dose and disease severity between Shigella and EIEC pathotypes.
Impact statement: Pathogenic Escherichia coli frequently cause infections in humans. Many E. coli exist in nature, and their ability to cause disease is fueled by their ability to incorporate novel genetic information through extensive horizontal gene transfer of plasmids and pathogenicity islands. The emergence of antibiotic-resistant Shigella, which is a pathogenic form of E. coli, coupled with the absence of an effective vaccine against it, highlights the importance of continued study of these pathogenic bacteria. Our study contributes to the understanding of genomic properties associated with the molecular mechanisms underpinning the pathogenic nature of Shigella. We show the contribution of insertion sequences to the adaptation of these intracellular pathogens and indicate a role of chromosomal ipaH genes in Shigella pathogenesis. The approaches developed in our study are broadly applicable to the investigation of genotype-phenotype correlation in historically young bacterial pathogens.
Data summary