Comparative analysis of core genome MLST and SNP typing within a European Salmonella serovar Enteritidis outbreak

Pearce, Madison E.; Alikhan, Nabil-Fareed; Dallman, Timothy J.; Zhou, Zhemin; Grant, Kathie; Maiden, Martin

doi:10.1016/j.ijfoodmicro.2018.02.023

Cited by 146 publications

(145 citation statements)

References 56 publications

Supporting

Mentioning

135

Contrasting


“…WGS has been shown to greatly enhance cluster detection, and improve resolution and accuracy in comparison to PFGE and MLVA in Salmonella, STEC and Listeria (Dallman et al.; Morganti et al.; Reimer et al.; Ung et al.; Waldram et al.). The discrimination of the outbreak epidemiology provided by WGS is not possible to reach using traditional microbial typing methods (Pearce et al.).…”

Section: Assessment (mentioning)

confidence: 99%

“…Whole genome MLST (wgMLST) is defined as a non-redundant set of genes that are present across a set of genomes representing a species, akin to a pan-genome. Consequently, a wgMLST scheme includes a greater number of genes and may also include highly variable elements such as repetitive genes and pseudogenes, if they are present in any included genome (Pearce et al., 2018).…”

Section: Mobile Genetic Element (mentioning)

confidence: 99%
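As a rough illustration of the distinction drawn in the wgMLST statement, a wgMLST scheme can be pictured as the union of loci found in any included genome, and a cgMLST scheme as the loci shared by all (or nearly all) of them. The sketch below uses made-up isolate and locus names and a hypothetical `min_fraction` cutoff; it is not the scheme-building procedure of any particular tool.

```python
from collections import Counter


def build_schemes(genomes, min_fraction=1.0):
    """Return (wgmlst_loci, cgmlst_loci) from a dict of genome -> set of loci.

    wgMLST: every locus seen in at least one genome (pan-genome-like).
    cgMLST: loci present in at least `min_fraction` of genomes (assumed cutoff).
    """
    counts = Counter(locus for loci in genomes.values() for locus in set(loci))
    n_genomes = len(genomes)
    wg = set(counts)
    cg = {locus for locus, c in counts.items() if c / n_genomes >= min_fraction}
    return wg, cg


# Hypothetical gene content for three isolates (names are illustrative only).
genomes = {
    "isolate_A": {"aroC", "dnaN", "hemD", "pseudo1"},
    "isolate_B": {"aroC", "dnaN", "hemD", "repeat7"},
    "isolate_C": {"aroC", "dnaN", "hemD"},
}
wg, cg = build_schemes(genomes)
print(sorted(cg))
print(sorted(wg - cg))
```

In this toy input the variable elements (`pseudo1`, `repeat7`) end up only in the wgMLST set, which mirrors the point that a wgMLST scheme can include loci present in just one genome.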


Whole genome sequencing and metagenomics for outbreak investigation, source attribution and risk assessment of food‐borne microorganisms

Koutsoumanis, Allende, Álvarez‐Ordóñez et al. 2019

EFS2

104


This Opinion considers the application of whole genome sequencing (WGS) and metagenomics for outbreak investigation, source attribution and risk assessment of food-borne pathogens. WGS offers the highest level of bacterial strain discrimination for food-borne outbreak investigation and source attribution as well as potential for more precise hazard identification, thereby facilitating more targeted risk assessment and risk management. WGS improves linking of sporadic cases associated with different food products and geographical regions to a point source outbreak and can facilitate epidemiological investigations, allowing also the use of previously sequenced genomes. Source attribution may be favoured by improved identification of transmission pathways, through the integration of spatial-temporal factors and the detection of multidirectional transmission and pathogen-host interactions. Metagenomics has potential, especially in relation to the detection and characterisation of non-culturable, difficult-to-culture or slow-growing microorganisms, for tracking of hazard-related genetic determinants and the dynamic evaluation of the composition and functionality of complex microbial communities. A SWOT analysis is provided on the use of WGS and metagenomics for Salmonella and Shiga toxin-producing Escherichia coli (STEC) serotyping and the identification of antimicrobial resistance determinants in bacteria. Close agreement between phenotypic and WGS-based genotyping data has been observed. WGS provides additional information on the nature and localisation of antimicrobial resistance determinants and on their dissemination potential by horizontal gene transfer, as well as on genes relating to virulence and biological fitness. Interoperable data will play a major role in the future use of WGS and metagenomic data. Capacity building based on harmonised, quality controlled operational systems within European laboratories and worldwide is essential for the investigation of cross-border outbreaks and for the development of international standardised risk assessments of food-borne microorganisms.


“…While whole genome comparison provides better resolution for pathogen disambiguation than PFGE [9] or gene-based or multi-locus sequence typing, [10][11][12][13][14] it also comes with bigger challenges [4]. The genomic 'distance' between pathogens can be defined in many ways, with or without a reference genome to guide the comparison.…”

Section: Introduction (mentioning)

confidence: 99%

“…Our goal is to compare and evaluate this conserved-sequence method against other core genome approaches typically used for whole-genome SNV comparison. Little prior work exists on core genome approaches for applications in clinical epidemiology [13], so we aim to provide insight into the advantages and drawbacks of commonly used methods, including conserved-gene approaches [20] and approaches that select genome regions with sufficient coverage in a set of samples [21][22][23]. We illustrate that sample-dependent core genome definitions are not suitable for prospective studies because they lead to variable SNV distances as samples are added, which complicates clinical decision-making.…”

Section: Introduction (mentioning)

confidence: 99%
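The claim that sample-dependent core genome definitions lead to variable SNV distances can be illustrated with a toy example. The sketch below assumes a deliberately simple definition (the core is every position covered in all samples, with `N` marking an uncovered position) and invented sequences; real pipelines use more elaborate coverage rules.

```python
def core_positions(samples):
    """Positions covered (not 'N') in every sample: a sample-dependent core."""
    length = len(next(iter(samples.values())))
    return [
        i for i in range(length)
        if all(seq[i] != "N" for seq in samples.values())
    ]


def snv_distance(a, b, positions):
    """Count differing bases between two sequences over the given positions."""
    return sum(1 for i in positions if a[i] != b[i])


# Two invented, pre-aligned samples that differ at positions 3 and 8.
samples = {
    "A": "ACGTACGTAC",
    "B": "ACGAACGTTC",
}
before = snv_distance(samples["A"], samples["B"], core_positions(samples))

# A third sample lacking coverage at position 3 shrinks the shared core.
samples["C"] = "ACGNACGTAC"
after = snv_distance(samples["A"], samples["B"], core_positions(samples))

print(before, after)
```

The distance between A and B changes once C is added, even though neither A nor B changed, which is the behaviour the statement describes as problematic for prospective use.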

A novel core genome approach to enable prospective and dynamic monitoring of infectious outbreaks

Aggelen, Kolde, Chamarthi et al. 2018

Preprint


Whole-genome sequencing is increasingly adopted in clinical settings to identify pathogen transmissions. Currently, such studies are performed largely retrospectively, but to be actionable they need to be carried out prospectively, in which samples are continuously added and compared to previous samples. To enable prospective pathogen comparison, genomic relatedness metrics based on single nucleotide differences must be consistent across time, efficient to compute and reliable for a large variety of samples. The choice of genomic regions to compare, i.e., the core genome, is critical to obtain a good metric. We propose a novel core genome method that selects conserved sequences in the reference genome by comparing its k-mer content to that of publicly available genome assemblies. The conserved-sequence genome is sample set-independent, which enables prospective pathogen monitoring. Based on clinical data sets of 3436 S. aureus, 1362 K. pneumoniae and 348 E. faecium samples, we show that the conserved-sequence genome disambiguates same-patient samples better than a core genome consisting of conserved genes. The conserved-sequence genome confirms outbreak samples with high accuracy: in a set of 2335 S. aureus samples, it correctly identifies 44 out of 45 outbreak samples, whereas the conserved gene method confirms 38 out of 45 outbreak samples.
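The abstract does not spell out how reference k-mers are compared with public assemblies, so the following is only a minimal sketch of the general idea under stated assumptions: a reference position is kept if it lies in a k-mer that occurs in at least `min_fraction` of a non-empty list of assemblies. The function names, the tiny `k`, the threshold and the sequences are all hypothetical, and reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Set of all k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_mask(reference, assemblies, k=4, min_fraction=0.9):
    """Mark reference positions covered by a k-mer found in enough assemblies.

    Assumed rule (not taken from the paper): a k-mer counts as conserved when
    it occurs in at least `min_fraction` of the supplied assemblies.
    """
    assembly_kmers = [kmers(a, k) for a in assemblies]
    mask = [False] * len(reference)
    for i in range(len(reference) - k + 1):
        kmer = reference[i:i + k]
        present = sum(kmer in ks for ks in assembly_kmers)
        if present / len(assemblies) >= min_fraction:
            for j in range(i, i + k):
                mask[j] = True
    return mask


# Invented sequences: only the leading AAAA block is shared by both assemblies.
reference = "AAAACCCC"
assemblies = ["AAAAGGGG", "AAAATTTT"]
mask = conserved_mask(reference, assemblies, k=4, min_fraction=1.0)
print(mask)
```

Because the mask depends only on the reference and the fixed assembly collection, it does not change as new clinical samples arrive, which is the sample set-independence the abstract emphasises.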


“…• 'cgMLST' scenario: we assume that we are interested in sequencing one quarter of the genome consisting of 100 equally spaced loci each of 10 kb. This scenario resembles the case in which one is interested in a core-genome multi-locus sequence typing [16].…”

mentioning

confidence: 99%
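The arithmetic behind the 'cgMLST' scenario can be checked directly: 100 loci of 10 kb cover 1 Mb, so if that is one quarter of the genome, the genome is 4 Mb long. The sketch below lays the loci out at equal spacing; the helper name and the choice to start the first locus at position 0 are assumptions made for illustration.

```python
def equally_spaced_loci(genome_length, n_loci, locus_length):
    """Half-open (start, end) intervals for loci placed at equal spacing."""
    spacing = genome_length // n_loci
    return [(i * spacing, i * spacing + locus_length) for i in range(n_loci)]


n_loci, locus_length = 100, 10_000
# Genome length implied by "one quarter of the genome" being targeted.
genome_length = 4 * n_loci * locus_length

loci = equally_spaced_loci(genome_length, n_loci, locus_length)
targeted = sum(end - start for start, end in loci)
print(genome_length, targeted / genome_length, loci[:2])
```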

Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design

Maio, Manser, Munro et al. 2020

Preprint


Real-time selective sequencing of individual DNA fragments, or 'Read Until', allows the focusing of Oxford Nanopore Technology sequencing on pre-selected genomic regions. This can lead to large improvements in DNA sequencing performance in many scenarios where only part of the DNA content of a sample is of interest. This approach is based on the idea of deciding whether to sequence a fragment completely after having sequenced only a small initial part of it. If, based on this small part, the fragment is not deemed of (sufficient) interest it is rejected and sequencing is continued on a new fragment. To date, only simple decision strategies based on location within a genome have been proposed to determine what fragments are of interest. We present a new mathematical model and algorithm for the real-time assessment of the value of prospective fragments. Our decision framework is based not only on which genomic regions are a priori interesting, but also on which fragments have so far been sequenced, and so on the current information available regarding the genome being sequenced. As such, our strategy can adapt dynamically during each run, focusing sequencing efforts in areas of highest uncertainty (typically areas with currently low coverage). We show that our approach …

For each position $i$ of a reference genome of length $N$, we denote $\pi_i(g)$ the location-specific prior on genotypes $g \in G$ before any data have been observed. In all applications below, when considering a haploid genome, we define the prior of reference nucleotide $b_R$ at position $i$ as $\pi_i(b_R) = 1 - \theta$, with $\theta$ the genetic diversity of the considered population. Conversely, $\pi_i(g) = \theta/3$ if $g \neq b_R$.

When considering diploid sequenced genomes, we still assume a haploid reference genome, with reference nucleotide at a given position denoted $b_R$. In the case of a diploid unphased genome being sequenced, we define $\pi_i(\{b_R, b_R\}) = 1 - \theta$, and $\pi_i(\{g, g\}) = p_{\text{homo}}\,\theta/3$ if $g \neq b_R$, with $p_{\text{homo}}$ being the proportion of site differences from a reference that are expected to be homozygous, and $\pi_i(\{g, b_R\}) = (1 - p_{\text{homo}})\,\theta/3$ for $g \neq b_R$. We ignore the possibility of a heterozygous genome being sequenced with both alleles different from the reference genome. These prior probability definitions also ignore differences in mutation rates across nucleotides and genome positions and do not use prior knowledge on SNP locations derived from the population; when available, these aspects could however easily be included in the definition of $\pi_i(g)$.
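The haploid and diploid priors in this excerpt can be written out as a short sketch, assuming (as the word "Conversely" implies) that the non-reference cases apply to genotypes different from the reference nucleotide. The function names and the example values of `theta` and `p_homo` are illustrative choices, not values from the preprint.

```python
NUCLEOTIDES = "ACGT"


def haploid_prior(ref_base, theta):
    """Prior over haploid genotypes: 1 - theta for the reference base,
    theta / 3 for each of the three alternative bases."""
    return {
        g: (1 - theta if g == ref_base else theta / 3)
        for g in NUCLEOTIDES
    }


def diploid_prior(ref_base, theta, p_homo):
    """Prior over unphased diploid genotypes, stored as sorted base pairs.

    Genotypes with both alleles different from the reference and from each
    other are left out, as in the excerpt.
    """
    prior = {}
    for g in NUCLEOTIDES:
        if g == ref_base:
            prior[(g, g)] = 1 - theta
        else:
            prior[(g, g)] = p_homo * theta / 3
            prior[tuple(sorted((g, ref_base)))] = (1 - p_homo) * theta / 3
    return prior


# Illustrative parameter values (assumptions, not taken from the preprint).
hap = haploid_prior("A", theta=0.01)
dip = diploid_prior("A", theta=0.01, p_homo=0.5)
print(sum(hap.values()), sum(dip.values()))
```

Both priors sum to one: in the diploid case the three homozygous and three heterozygous alternatives contribute $p_{\text{homo}}\theta$ and $(1 - p_{\text{homo}})\theta$ respectively, which together with $1 - \theta$ for the reference genotype gives 1.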
