Clostridioides difficile is the primary infectious cause of antibiotic-associated diarrhea. Local transmissions and international outbreaks of this pathogen have been previously elucidated by bacterial whole-genome sequencing, but comparative genomic analyses at the global scale were hampered by the lack of specific bioinformatic tools. Here we introduce EnteroBase, a publicly accessible database (http://enterobase.warwick.ac.uk) that automatically retrieves and assembles C. difficile short reads from the public domain, and calls alleles for core-genome multilocus sequence typing (cgMLST). We demonstrate that the identification of highly related genomes is 89% consistent between cgMLST and single-nucleotide polymorphisms. EnteroBase currently contains 13,515 quality-controlled genomes, which have been assigned to hierarchical sets of single-linkage clusters by cgMLST distances. Hierarchical clustering can be used to identify populations of C. difficile at all epidemiological levels, from recent transmission chains through to pandemic and endemic strains, and is largely compatible with prior ribotyping. Hierarchical clustering thus enables comparisons to earlier surveillance data and will facilitate communication among researchers, clinicians and public-health officials who are combatting disease caused by C. difficile.
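To illustrate what assigning genomes to hierarchical sets of single-linkage clusters by cgMLST distances can mean in practice, the following minimal Python sketch may help. It is not EnteroBase's implementation: the function names, the toy allele profiles and the thresholds are all hypothetical. It assumes that the distance between two genomes is the number of core-genome loci with differing allele numbers, and that loci missing from either profile are ignored.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci whose allele numbers differ.

    Loci missing (None) in either profile are skipped (an assumption).
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage(profiles, threshold):
    """Label each genome with its single-linkage cluster.

    Two genomes share a cluster if they are connected by a chain of
    pairs whose allelic distance is <= threshold (union-find).
    """
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the lexicographically smallest name as the label.
                parent[max(ra, rb)] = min(ra, rb)
    return {n: find(n) for n in names}


def hierarchical_clusters(profiles, thresholds):
    """Cluster the same genomes at several (hypothetical) thresholds."""
    return {t: single_linkage(profiles, t) for t in thresholds}


# Toy cgMLST profiles with six loci each (illustrative data only).
profiles = {
    "g1": [1, 1, 1, 1, 1, 1],
    "g2": [1, 1, 1, 1, 1, 2],  # 1 allele from g1
    "g3": [1, 1, 1, 1, 3, 2],  # 1 allele from g2, 2 from g1
    "g4": [4, 4, 4, 4, 4, 4],  # 6 alleles from all others
}
levels = hierarchical_clusters(profiles, thresholds=[0, 1, 6])
for threshold, labels in levels.items():
    print(threshold, labels)
```

In this toy example, g3 joins g1 at threshold 1 even though the two differ at two loci, because g2 links them. This chaining is characteristic of single-linkage clustering, and it means that clusters at a low threshold are always nested within clusters at a higher one.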