The gram-positive enteropathogen Clostridioides difficile is the major cause of healthcare-associated diarrhoea and is also an important cause of community-acquired infectious diarrhoea. Considering the burden of the disease, many studies have employed whole genome sequencing to identify factors that contribute to virulence and pathogenesis. Though extrachromosomal elements such as plasmids are important for these processes in other bacteria, the few characterized plasmids of C. difficile have no relevant functions assigned and no systematic identification of plasmids has been carried out to date. Here, we perform an in silico analysis of publicly available sequence data to show that ~13% of all C. difficile strains contain extrachromosomal elements, with 1-6 elements per strain. Our approach identifies known plasmids (e.g. pCD6, pCD630 and cloning plasmids) and 6 novel putative plasmid families. Our study shows that plasmids are abundant and may encode functions that are relevant for C. difficile physiology. The newly identified plasmids may also form the basis for the construction of novel cloning plasmids for C. difficile that are compatible with existing tools.
Clostridioides difficile (Clostridium difficile) [1] is a gram-positive, endospore-forming, anaerobic bacterium. It is an opportunistic pathogen in humans and is the causative agent of most cases of antibiotic-associated diarrhoea [2]. In recent years, the bacterium has also increasingly been found in cases of infectious diarrhoea that cannot be linked to healthcare exposure [2]. C. difficile infections can be refractory to antimicrobial therapy and, even when initial cure is observed, relapses are frequent [2]. Typing methods for C. difficile include (capillary) PCR ribotyping, multilocus sequence typing (MLST) and single nucleotide polymorphism (SNP) typing after whole genome sequencing [3]. Since the beginning of the 21st century, an increase in C. difficile infections due to epidemic types such as PCR ribotypes 027 and 078 has been noted [4, 5]. Although the molecular mechanisms underlying the epidemicity are poorly understood and remain under debate [6], robust toxin production and sporulation [7], altered surface properties [8], resistance to antimicrobials [5, 9] and an increased ability to metabolize certain sugars [10] have been implicated.

In other organisms, the contribution of plasmid-encoded functions to virulence and pathogenesis is well-documented [11, 12, 13]. By contrast, only a limited number of plasmids have been identified in C. difficile and all of these are cryptic, i.e. no traits have been associated with plasmid carriage. Commonly used cloning vectors for C. difficile make use of a replicon derived from the 6.8 kb plasmid pCD6 [14]. The reference strain 630 contains a single plasmid, pCD630, that is part of a larger family of 7.8-11.8 kb plasmids [15, 16]. Recently, several large (>42 kb) plasmids were also described [17]. Neverthele...