2021
DOI: 10.1038/s42003-020-01626-5

Mash-based analyses of Escherichia coli genomes reveal 14 distinct phylogroups

Abstract: In this study, more than one hundred thousand Escherichia coli and Shigella genomes were examined and classified. This is, to our knowledge, the largest E. coli genome dataset analyzed to date. A Mash-based analysis of a cleaned set of 10,667 E. coli genomes from GenBank revealed 14 distinct phylogroups. A representative genome or medoid identified for each phylogroup was used as a proxy to classify 95,525 unassembled genomes from the Sequence Read Archive (SRA). We find that most of the sequenced E. coli geno…
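
The medoid-as-proxy idea in the abstract can be illustrated with a minimal sketch. It assumes pairwise distances are already available as a matrix (the study derives them with Mash; the values below are invented toy numbers), and the function names `medoid` and `classify` are hypothetical, not taken from the paper or from the Mash tool.

```python
from typing import Dict, Sequence


def medoid(members: Sequence[int], dist: Sequence[Sequence[float]]) -> int:
    """Return the group member with the smallest summed distance to the group."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def classify(dist_to_medoids: Dict[str, float]) -> str:
    """Assign a query genome to the phylogroup whose medoid is closest."""
    return min(dist_to_medoids, key=dist_to_medoids.get)


# Toy symmetric distance matrix for four genomes (invented values).
dist = [
    [0.00, 0.01, 0.02, 0.10],
    [0.01, 0.00, 0.01, 0.11],
    [0.02, 0.01, 0.00, 0.12],
    [0.10, 0.11, 0.12, 0.00],
]

# Genomes 0, 1 and 2 form one hypothetical group; pick its representative.
representative = medoid([0, 1, 2], dist)

# Hypothetical distances from one unassembled query to three group medoids.
assigned = classify({"A": 0.04, "B1": 0.02, "B2": 0.09})
```

Comparing each query against one representative per group, rather than against every genome in the group, is what keeps classification of a large read archive tractable in a scheme like this.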

Citations: cited by 75 publications (103 citation statements)
References: 57 publications
“…E. coli strains have been classified, revealing a variety of phylogenetic groups [ 42 ], the most prominent of which are A, B1, B2, D, E and S [ 43 , 44 ]. Commensal strains in humans belong mostly to phylogroup A [ 40 ], while pathogenic strains belong mostly to phylogroups B2 and D [ 44 , 45 ].…”
Section: Results (mentioning)
confidence: 99%
“…The enterohemorrhagic O157:H7 strains are in group E, whereas Shigella falls into the additional group S [ 46 ]. In a recent analysis, over 100,000 E. coli genomes were used for an extensive phylogenetic analysis [ 42 ]. This analysis allowed the more detailed definition of 14 phylogroups G, B2-1, B2-2, F, D1, D2, D3, E2, E1, A, C, B1, Shig1 and Shig2.…”
Section: Results (mentioning)
confidence: 99%
“…For most of these countless different genes, no function is known. Even for the best-studied microorganism, Escherichia coli K-12, the function of about one third of the 4,500 genes is unknown, and this number rises significantly again when the pangenome of E. coli, with over 100,000 different gene clusters, is included [2]. For less well-studied microorganisms, this number is considerably higher still [1].…”
Section: VAAM-Forschungspreis 2021 (unclassified)