The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range making it an excellent model system to study these processes. Here, we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated 12 major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of 11 populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) from humans to bovine, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another, and biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.
Background The 16S rRNA gene is used extensively in bacterial phylogenetics, in species delineation, and now widely in microbiome studies. However, the gene suffers from intragenomic heterogeneity, and reports of recombination and an unreliable phylogenetic signal are accumulating. Here, we compare core gene phylogenies to phylogenies constructed using core gene concatenations to estimate the strength of signal for the 16S rRNA gene, its hypervariable regions, and all core genes at the intra- and inter-genus levels. Specifically, we perform four intra-genus analyses (Clostridium, n = 65; Legionella, n = 47; Staphylococcus, n = 36; and Campylobacter, n = 17) and one inter-genus analysis [41 core genera of the human gut microbiome (31 families, 17 orders, and 12 classes), n = 82]. Results At both taxonomic levels, the 16S rRNA gene was recombinant and subject to horizontal gene transfer. At the intra-genus level, the gene showed one of the lowest levels of concordance with the core genome phylogeny (50.7% average). Concordance for hypervariable regions was lower still, with entropy masking providing little to no benefit. A major factor influencing concordance was SNP count, which showed a positive logarithmic association. Using this relationship, we determined that 690 ± 110 SNPs were required for 80% concordance (average 16S rRNA gene SNP count was 254). We also found a wide range in 16S-23S-5S rRNA operon copy number among genomes (1–27). At the inter-genus level, concordance for the whole 16S rRNA gene was markedly higher (73.8% — 10th out of 49 loci); however, the most concordant hypervariable regions (V4, V3-V4, and V1-V2) ranked in the third quartile (62.5 to 60.0%). Conclusions Ramifications of a poor phylogenetic performance for the 16S rRNA gene are far reaching. For example, in addition to incorrect species/strain delineation and phylogenetic inference, it has the potential to confound community diversity metrics if phylogenetic information is incorporated — for example, with popular approaches such as Faith’s phylogenetic diversity and UniFrac. Our results highlight the problematic nature of these approaches and their use (along with entropy masking) is discouraged. Lastly, the wide range in 16S rRNA gene copy number among genomes also has a strong potential to confound diversity metrics.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.