Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole-genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety, including the CDC, FDA, and USDA, have been performing WGS on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When a difference of a few single nucleotide polymorphisms (SNPs) can help distinguish genomes that are likely outbreak-associated from those that are less likely to be associated, a fine-resolution method is required. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET alongside four other SNP pipelines that have been used in outbreak investigation or similar scenarios by retrospectively investigating 12 outbreak data sets. To compare these pipelines, several distance-based and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains.
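The threshold idea described above can be illustrated with a minimal Python sketch. It is not Lyve-SET's implementation: it assumes the genomes are already aligned to equal length, counts pairwise SNP differences, and groups isolates by single-linkage clustering. The function names, the toy alignment, and the threshold value of 2 are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions in two aligned sequences, skipping N and gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ignore = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in ignore and y not in ignore
    )


def cluster_by_threshold(seqs: dict[str, str], threshold: int) -> list[set[str]]:
    """Single-linkage clustering: link isolates whose distance is < threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Invented toy alignment; real analyses use genome-scale SNP matrices.
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGTACGAAT",
    "iso4": "TTGTCCGAGG",
}
clusters = cluster_by_threshold(toy, threshold=2)
print(clusters)
```

Single linkage is only one possible grouping rule. An actual investigation would also weigh the phylogeny and the epidemiological evidence rather than relying on a distance cutoff alone.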
Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized genomic profiles, its central database, and its ability to be run in a graphical user interface. However, creating a functional wgMLST scheme requires extended up-front development and subject-matter expertise. When a scheme does not exist or when the highest resolution is needed, SNP analysis is used. Using three Listeria outbreak data sets, we demonstrated the concordance between Lyve-SET SNP typing and wgMLST. Availability: Lyve-SET can be found at https://github.com/lskatz/Lyve-SET.
The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research laboratory and into the front lines of surveillance and outbreak response requires user-friendly, reproducible and scalable pipelines that have been well validated. Single Nucleotide Variant Phylogenomics (SNVPhyl) is a bioinformatics pipeline for identifying high-quality single-nucleotide variants (SNVs) and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and identify and remove regions of high SNV density (indicative of recombination). SNVPhyl is able to correctly distinguish outbreak from non-outbreak isolates across a range of variant-calling settings, sequencing-coverage thresholds or in the presence of contamination. SNVPhyl is available as a Galaxy workflow, Docker and virtual machine images, and a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.
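The removal of high-density SNV regions mentioned above can be sketched in Python as follows. This is an illustrative sliding-window filter under stated assumptions, not SNVPhyl's actual code: the function name, the window size of 20 bp, and the threshold of 3 SNVs per window are hypothetical values, and the positions are invented.

```python
from bisect import bisect_right


def split_by_density(positions, window=20, threshold=3):
    """Separate SNV positions into (kept, dense).

    A position is flagged as dense if it falls inside any window of
    `window` bp, starting at an SNV, that holds at least `threshold` SNVs.
    """
    pos = sorted(set(positions))
    dense = set()
    for i, start in enumerate(pos):
        # Index of the first SNV lying beyond the window [start, start + window - 1].
        j = bisect_right(pos, start + window - 1)
        if j - i >= threshold:
            dense.update(pos[i:j])
    kept = [p for p in pos if p not in dense]
    return kept, sorted(dense)


# Invented positions on a single reference contig.
snvs = [100, 5000, 5004, 5010, 5015, 9000, 12000, 12030]
kept, dense = split_by_density(snvs, window=20, threshold=3)
print("kept:", kept)
print("dense:", dense)
```

With these toy inputs, the four SNVs clustered around position 5000 are flagged and the isolated ones are kept. A production pipeline would apply such a filter per contig and per sample before building the alignment.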
Salmonella enterica serovar Heidelberg is the second most frequently occurring serovar in Quebec and the third most prevalent in Canada. Given that conventional pulsed-field gel electrophoresis (PFGE) subtyping for common Salmonella serovars, such as S. Heidelberg, yields identical subtypes for the majority of isolates recovered, public health laboratories are desperate for new subtyping tools to resolve highly clonal S. Heidelberg strains involved in outbreak events. As PFGE was unable to discriminate isolates from three epidemiologically distinct outbreaks in Quebec, this study was conducted to evaluate whole-genome sequencing (WGS) and phylogenetic analysis as an alternative to conventional subtyping tools. Genomes of 46 isolates from 3 Quebec outbreaks (2012, 2013, and 2014) supported by strong epidemiological evidence were sequenced and analyzed using a high-quality core genome single-nucleotide variant (hqSNV) bioinformatics approach (SNV phylogenomics [SNVPhyl] pipeline). Outbreaks were indistinguishable by conventional PFGE subtyping, exhibiting the same PFGE pattern (SHEXAI.0001/SHEBNI.0001). Phylogenetic analysis based on hqSNVs extracted from WGS separated the outbreak isolates into three distinct groups, 100% concordant with the epidemiological data. The minimum and maximum number of hqSNVs between isolates from the same outbreak was 0 and 4, respectively, while >59 hqSNVs were measured between 2 previously indistinguishable outbreaks having the same PFGE and phage type, thus corroborating their distinction as separate unrelated outbreaks. This study demonstrates that despite the previously reported high clonality of this serovar, the WGS-based hqSNV approach is a superior typing method, capable of resolving events that were previously indistinguishable using classic subtyping tools.
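The within-outbreak and between-outbreak hqSNV ranges reported above are the kind of summary that can be computed from a pairwise distance table, as in the following Python sketch. The isolate names, outbreak labels, and distances are invented for illustration and are not the study's data; the function name is hypothetical.

```python
from itertools import combinations


def summarize_distances(dist, outbreak_of):
    """Return min/max within-outbreak and min between-outbreak distances.

    `dist` maps frozenset({a, b}) to the SNV distance between isolates a and b;
    `outbreak_of` maps each isolate to its epidemiologically assigned outbreak.
    """
    within, between = [], []
    for a, b in combinations(sorted(outbreak_of), 2):
        d = dist[frozenset((a, b))]
        if outbreak_of[a] == outbreak_of[b]:
            within.append(d)
        else:
            between.append(d)
    return {
        "within_min": min(within),
        "within_max": max(within),
        "between_min": min(between),
    }


# Invented labels and distances, for illustration only.
outbreak_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
dist = {
    frozenset(("A1", "A2")): 0,
    frozenset(("B1", "B2")): 3,
    frozenset(("A1", "B1")): 70,
    frozenset(("A1", "B2")): 72,
    frozenset(("A2", "B1")): 70,
    frozenset(("A2", "B2")): 73,
}
summary = summarize_distances(dist, outbreak_of)
print(summary)
```

A clear gap between the largest within-outbreak distance and the smallest between-outbreak distance is what supports treating the groups as separate events.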
Nontyphoidal Salmonella enterica strains are important bacterial agents of salmonellosis in humans and animals (1) and represent up to 125,000 cases annually of foodborne gastroenteric disease arising from sporadic and outbreak events in Canada (2). More than 2,500 Salmonella enterica serovars have been described, but only a few have been associated with cases of human illness (3, 4). Salmonella Heidelberg ranks third and fourth among serovars causing human illness in Canada (5) and the United States (6), respectively, and is commonly detected in retail meat samples and food animals. While the majority of Salmonella infections are mild and self-limiting, S. Heidelberg can cause more severe diseases, including septicemia, myocarditis, extraintestinal infections, and death (7, 8). Pulsed-field gel electrophoresis (PFGE) is the gold standard method used by Canadian public health laboratories for the molecular typing of S. Heidelberg, following standardized procedures set out by the PulseNet Canada guidelines. A well-recognized limitation of this classic typing method is that strains bearing highly common PFGE patterns occasionally render PFGE ineffective at detecting foodborne outbreaks from background sporadic cases, thus li...