Most bacteria exchange genetic material through Horizontal Gene Transfer (HGT). The primary vehicles for HGT are plasmids and plasmid-borne transposable elements, though their population structure and dynamics remain poorly understood. Here, we quantified genetic similarity between more than 10,000 bacterial plasmids and reconstructed a network based on their shared k-mer content. Using a community detection algorithm, we assigned plasmids into cliques which are highly correlated with plasmid gene content, bacterial host range, GC content, as well as replicon and mobility (MOB) type classifications. Resolving the plasmid population structure further allowed identification of candidates for yet-undescribed replicon genes. Our work provides biological insights into the dynamics of plasmids and plasmid-borne mobile elements, with the latter representing the main drivers of HGT at broad phylogenetic scales. Our results illustrate the potential of network-based analyses for the bacterial 'mobilome' and open up the prospect of a natural, exhaustive classification framework for bacterial plasmids.
[...] assignments and can cover a potentially wider taxonomic range; however, they are not applicable to the classification of non-mobilizable plasmids. These two typing schemes have inspired several in silico classification tools, such as PlasmidFinder [12], the plasmid MultiLocus Sequence Typing (MLST) database, and MOB-suite [15].
However, all of these tools intrinsically rely on the completeness of their reference sequence databases, which typically lack representatives from understudied and/or unculturable bacterial hosts.
As bacterial plasmids undergo extensive recombination and HGT, their evolutionary history is not well captured by phylogenetic trees, which are designed for the analysis of point mutation in sequence alignments [16,17]. Network models offer an attractive alternative given they can incorporate both horizontal and vertical inheritance [18], and can deal with point mutations as well as structural variants. Networks have gained much attention in the past decade as an alternative method for studying prokaryotic evolution, including plasmids [3,8,18,19]. Plasmid gene-sharing networks have proven a useful means to track AMR and virulence dissemination yielding deeper insights
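
The network approach summarised in the abstract (pairwise similarity from shared k-mer content, followed by grouping of connected plasmids) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration only: the Jaccard index as the similarity measure, k = 4, the 0.3 edge threshold, the toy sequences, and all function names. Connected components stand in for the community detection step; this is a deliberately simpler technique than the one used in this work.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(seqs, k=4, threshold=0.3):
    """Build an undirected graph (adjacency sets) linking sequences
    whose k-mer Jaccard index reaches the hypothetical threshold."""
    profiles = {name: kmers(s, k) for name, s in seqs.items()}
    graph = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if jaccard(profiles[u], profiles[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Group nodes by reachability (a simple stand-in for
    community detection, not the algorithm used in the paper)."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Toy sequences (invented): pA and pB differ by a single base,
# pC shares no 4-mers with either of them.
toy = {
    "pA": "ATGCGTACGTTAGCATGCGT",
    "pB": "ATGCGTACGTTAGCATGCGA",
    "pC": "GGGGCCCCGGGGCCCCGGGG",
}
groups = connected_components(similarity_network(toy))
```

On the toy input, `pA` and `pB` end up in one group and `pC` in a group of its own. At the scale of thousands of plasmids, exact k-mer sets and all-against-all comparison would likely need to be replaced by sketching or indexing approaches, which are beyond this illustration.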
Results
A dataset of complete bacterial plasmids
A dataset of complete bacterial plasmids was assembled comprising 10,696 sequences found in bacteria from 22 phyla and over 400 genera (Supplementary Table 1, Figure 1A, and Supplementary Figure 1). The composition of plasmid hosts reflects current research interests, with the Proteobacteria and Firmicutes phyla together representing over 84% of plasmid sequences. In total, 510,463 different Coding Sequences (CDSs) were identified in the plasmid dataset. 66.01% of the CDSs were predicted to encode a hypothetical protein, 27.9% had a known product with Gene Ontology (GO) biological process annotation, with the remai...