2019
DOI: 10.1101/727230
Preprint

Global genomic population structure of Clostridioides difficile

Abstract: Clostridioides difficile is the primary infectious cause of antibiotic-associated diarrhea. Local transmissions and international outbreaks of this pathogen have been previously elucidated by bacterial whole-genome sequencing, but comparative genomic analyses at the global scale were hampered by the lack of specific bioinformatic tools. Here we introduce EnteroBase, a publicly accessible database (http://enterobase.warwick.ac.uk) that automatically retrieves and assembles C. difficile short-reads from the pu…

Cited by 7 publications (10 citation statements)
References 47 publications
“…Thus, cgST HierCC provides a robust approach to analyse population structures at multiple levels of resolution. The identification of closely related genomes using HierCC has been shown to be 89 % consistent between cgMLST and SNPs [34]. Neighbour-joining trees were reconstructed with NINJA – a hierarchical clustering algorithm for inferring phylogenies that is capable of scaling to inputs larger than 100,000 sequences [35].…”
Section: Methods; citation type: mentioning
confidence: 99%
“…PEPPA can construct a pan-genome from thousands of genomes with high genetic diversities, and earlier versions of this pipeline were used to generate cgMLST schemes for the genera represented in EnteroBase (Alikhan et al 2018; Frentrup et al 2019; Zhou et al 2020) as well as for ancient DNA analyses (Zhou et al 2018b; Achtman and Zhou 2019). Here we address the genus Streptococcus, which includes very genetically diverse strains and highly significant zoonotic and human pathogens (Gao et al 2014), to demonstrate how to use PEPPA to construct a pan-genome from representative genomes.…”
Section: A Pan-genome For the Genus Streptococcus; citation type: mentioning
confidence: 99%
“…Similarly, earlier publications indicated fewer and fewer genes within the core-genome as more and more genomes were examined, but this has also not been revisited with the larger datasets. Finally, genetic relationships that are based on single nucleotide polymorphisms are extremely difficult to calculate in real-time with very large datasets, and some large databases focus on core genome multilocus sequence typing (cgMLST) (Mellmann et al 2011; Moura et al 2016; Alikhan et al 2018; Frentrup et al 2019; Zhou et al 2020) for genotyping and recognizing relationships. cgMLST schemes within public databases calculate the allelic variants within a fixed set of core genes.…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…Thus, cgST HierCC provides a robust approach to analyse population structures at multiple levels of resolution. The identification of closely-related genomes using HierCC has been shown to be 89% consistent between cgMLST and single-nucleotide polymorphisms (42). Neighbour-joining trees were reconstructed with Ninja—a hierarchical clustering algorithm for inferring phylogenies that is capable of scaling to inputs larger than 100,000 sequences (43).…”
Section: Methods; citation type: mentioning
confidence: 99%
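
The citation statements above describe cgMLST as comparing allelic variants across a fixed set of core genes, and HierCC as grouping genomes at multiple levels of resolution. The following toy sketch illustrates that idea only; it is not the EnteroBase or HierCC implementation. It assumes single-linkage grouping on pairwise allelic distance, and the function names, the `profiles` data, and the thresholds are all hypothetical.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci where two cgMLST profiles carry different alleles.

    Loci missing in either profile (None) are skipped (an assumption).
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage_clusters(profiles, threshold):
    """Group genomes connected by a chain of pairs within `threshold`."""
    parent = {name: name for name in profiles}

    def find(n):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical allele numbers at six core loci for four genomes.
profiles = {
    "g1": [1, 1, 1, 1, 1, 1],
    "g2": [1, 1, 1, 1, 1, 2],
    "g3": [1, 1, 1, 1, 3, 3],
    "g4": [4, 4, 4, 4, 4, 4],
}

# One clustering per resolution level (thresholds chosen for illustration).
levels = {t: single_linkage_clusters(profiles, t) for t in (0, 1, 2, 6)}
for t, groups in levels.items():
    print(t, groups)
```

Lower thresholds separate closely related genomes, while higher thresholds merge them into broader groups; this is the sense in which a single set of allelic profiles can be read at several levels of resolution.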