Mapping high-throughput sequencing (HTS) reads to a single arbitrary reference genome is a common approach in microbial genomics. However, the choice of reference can introduce errors that affect subsequent analyses such as the detection of single nucleotide polymorphisms (SNPs) and phylogenetic inference. In this work, we evaluated the effect of reference choice on short-read sequence data from five clinically and epidemiologically relevant bacteria (Klebsiella pneumoniae, Legionella pneumophila, Neisseria gonorrhoeae, Pseudomonas aeruginosa and Serratia marcescens). Publicly available whole-genome assemblies encompassing the genomic diversity of these species were selected as reference sequences, and read alignment statistics, SNP calling, recombination rates, dN/dS ratios, and phylogenetic trees were compared across mapping references. The choice of reference genome affected almost all the parameters considered in the five species. In addition, these biases had potential epidemiological implications, such as the inclusion or exclusion of isolates from particular clades and changes in the estimated genetic distances. These findings suggest that the single-reference approach might introduce systematic errors during mapping that affect subsequent analyses, particularly for data sets with isolates from genetically diverse backgrounds. In any case, exploring the effects of different references on the final conclusions is highly recommended.
The emergence of multidrug-resistant bacteria is a major global health concern. The search for new therapies has brought bacteriophages into the spotlight, and new phages are being described as possible therapeutic agents. Among the bacteria that are most extensively resistant to current antibiotics is Klebsiella pneumoniae, whose hypervariable extracellular capsule makes treatment particularly difficult. Here, we describe two new K. pneumoniae phages, πVLC5 and πVLC6, isolated from environmental samples. These phages belong to the genus Drulisvirus within the family Podoviridae. Both phages encode a similar tail spike protein with putative depolymerase activity, which is shared among other related phages and probably determines their ability to specifically infect K. pneumoniae capsular types K22 and K37. In addition, we found that phage πVLC6 also infects capsular type K13 and is capable of stripping the capsules of K. pneumoniae KL2 and KL3, although the phage was not infectious in these two strains. Genome sequence analysis suggested that the extended tropism of phage πVLC6 is conferred by a second, divergent depolymerase. Phage πVLC5 encodes yet another putative depolymerase, but after testing a panel of 77 reference strains we found no activity of this phage against capsular types other than K22 and K37. Overall, our results confirm that most phages productively infect one or a few Klebsiella capsular types. This constitutes an important challenge for clinical applications.