Genomics and outbreak investigation: from sequence to consequence

Robinson, Esther; Walker, Timothy M; Pallen, Mark J.

doi:10.1186/gm440

Cited by 72 publications

(76 citation statements)

References 58 publications

“…Genomic epidemiology has been increasingly applied to the study of epidemic-associated clonal variants of infectious agents to separate them from other related isolates (10,11). This field combines traditional epidemiological methods with genome sequence similarity analysis of bacterial isolates from patients and their potential reservoirs associated with different times and places.…”

mentioning

confidence: 99%

“…This approach was extensively applied, e.g., during the large multistate Escherichia coli O104:H4 epidemic (12). WGS will change the way we compare isolates because it allows for the analysis of all potential diversities and enables the detection of novelties associated with genome synteny, sequences, single genes, and polymorphisms between the isolates, as well as potential changes in the genomes that occur during infection (10,13). We recently found that only a few single-nucleotide polymorphisms (SNPs) are detected in C. jejuni after its intestinal passage through an infected patient (13).…”

mentioning

confidence: 99%

Genomic Variation between Campylobacter jejuni Isolates Associated with Milk-Borne-Disease Outbreaks

et al. 2014

Bacterial genome sequencing has led to the development of new approaches for the analysis of food-borne epidemics and the exploration of the relatedness of outbreak-associated isolates and their separation from nonassociated isolates. Using Illumina technology, we sequenced a total of six isolates (two from patients, two from raw bulk milk, and two from dairy cattle) associated with a milk-borne Campylobacter jejuni outbreak in a farming family and compared their genomes. These isolates had identical pulsed-field gel electrophoresis (PFGE) types, and their multilocus sequence typing (MLST) type was ST-50. We used the Ma_1 isolate (milk) as the reference, and its genome was assembled and tentatively ordered using the C. jejuni NCTC 11168 genome as the scaffold. Using whole-genome MLST (wgMLST), we identified a total of three single-nucleotide polymorphisms (SNPs) and differences in poly(G or C) or poly(A or T) tracts in 12 loci among the isolates. Several new alleles not present in the database were detected. In contrast, the sequences of the unassociated C. jejuni strains P14 and 1-12S (both ST-50) differed by 420 to 454 alleles from the epidemic-associated isolates. We found that the fecal contamination of bulk tank milk occurred by highly related sequence variants of C. jejuni, which are reflected as SNPs and differences in the length of the poly(A or T) tracts. Poly(G or C) tracts are reversibly variable and are thus unstable markers for comparison. Further, unrelated strains of ST-50 were clearly separated from the outbreak-associated isolates, indicating that wgMLST is an excellent tool for analysis. In addition, other useful data related to the genes and genetic systems of the isolates were obtained.
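The wgMLST comparison described in this abstract comes down to counting the loci at which two isolates carry different alleles. The following is a minimal sketch of that counting step only, not the authors' pipeline. The isolate names, locus names, and allele numbers are hypothetical, and the rule of skipping loci with a missing call (`None`) is an assumption.

```python
from itertools import combinations


def allele_differences(profile_a, profile_b):
    """Count loci where both isolates have a called allele and the alleles differ.

    Loci missing from either profile, or called as None, are skipped
    (an assumption; other conventions count them as differences).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def pairwise_distances(profiles):
    """Return allele-difference counts for every pair of isolates."""
    return {
        (a, b): allele_differences(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical toy profiles: locus name -> allele number (None = no call).
profiles = {
    "milk_1": {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 7},
    "patient_1": {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 8},
    "unrelated": {"locus1": 3, "locus2": 6, "locus3": None, "locus4": 9},
}

distances = pairwise_distances(profiles)
for pair, count in distances.items():
    print(pair, count)
```

In this toy data the two outbreak-like profiles differ at a single locus, while the unrelated profile differs from each of them at every locus that has a call. That is the same qualitative pattern the abstract reports at a much larger scale (a handful of differences within the outbreak versus 420 to 454 alleles for unassociated ST-50 strains).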

“…The most reliable variants for building such phylogenies are single-nucleotide polymorphisms (SNPs). Thus, core-genome SNP typing is currently the standard method for reconstructing large phylogenies of closely related microbes [45]. Currently, there are three paradigms for core-genome SNP typing based on read mapping, k-mer analyses, and whole-genome alignment.…”

mentioning

confidence: 99%
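The citation statement above refers to core-genome SNP typing from a whole-genome alignment. As a rough illustration of that idea, the sketch below takes an alignment that has already been computed, keeps only columns where every genome has an unambiguous base (the "core"), and reports the variable ones as SNP positions. This is a simplified stand-in, not the algorithm used by Parsnp or any other tool; the sequences and function names are hypothetical.

```python
def core_snp_columns(alignment):
    """Return 0-based column indices that are core (A/C/G/T in every genome)
    and variable (more than one distinct base)."""
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")

    valid_bases = set("ACGT")
    snp_columns = []
    for i in range(length):
        column = {seq[i].upper() for seq in sequences}
        # Gaps and ambiguity codes remove the column from the core.
        if column <= valid_bases and len(column) > 1:
            snp_columns.append(i)
    return snp_columns


def snp_distance(seq_a, seq_b, columns):
    """Count core SNP columns at which two aligned genomes differ."""
    return sum(seq_a[i].upper() != seq_b[i].upper() for i in columns)


# Hypothetical toy alignment of three genomes ('-' marks a gap).
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACCTA-GA",
}

columns = core_snp_columns(alignment)
print(columns)
print(snp_distance(alignment["A"], alignment["C"], columns))
```

Column 5 is variable in the toy data but contains a gap, so it is excluded from the core. Only columns 2 and 7 count toward the pairwise distances.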

The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes

Treangen, Ondov, Koren et al. 2014, Genome Biol

Whole-genome sequences are now available for many microbial species and clades, however existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.

Rationale

Microbial genomes represent over 93% of past sequencing projects, with the current total over 10,000 and growing exponentially. Multiple clades of draft and complete genomes comprising hundreds of closely related strains are now available from public databases [1], largely due to an increase in sequencing-based outbreak studies [2]. The quality of future genomes is also set to improve as short-read assemblers mature [3] and long-read sequencing enables finishing at greatly reduced costs [4,5].

One direct benefit of high-quality genomes is that they empower comparative genomic studies based on multiple genome alignment. Multiple genome alignment is a fundamental tool in genomics essential for tracking genome evolution [6][7][8] [26,40], recombination, homoplasy, gene conversion, mobile genetic elements, pseudogenization, and convoluted orthology relationships [25]. In addition, the computational burden of multiple sequence alignment remains very high [41] despite recent progress [42].

The current influx of microbial sequencing data necessitates methods for large-scale comparative genomics and shifts the focus towards scalability. Current microbial genome alignment methods focus on all-versus-all progressive alignment [31,36] to detect subset relationships (that is, gene gain/loss), but these methods are bounded at various steps by quadratic time complexity. This exponential growth in compute time prohibits comparisons involving thousands of genomes. Chan and Ragan [43] reiterated this point, emphasizing that current phylogenomic methods, such as multiple alignment, will not scale with the increasing number of genomes, and that 'alignment-free' or exact alignment methods must be used to analyze such datasets. However, such approaches do not come without compromising phylogenetic resolution [44].

Core-genome alignment is a subset of whole-genome alignment, focused on identifying the set of orthologous

“…Recently, high-throughput DNA sequencing, particularly bench-top sequencing, has brought many new opportunities to this field (42)(43)(44)(45) and has allowed bacterial genomics to be integrated into what might be called "public health microbiology version 2.0" (v2.0) through WGS of cultured isolates to provide simultaneous information on organism identity, epidemiology, and antimicrobial therapy (Fig. 2).…”

Section: The Role Of Clinicogenomics In Public Health Microbiology

mentioning

confidence: 99%

Role of Clinicogenomics in Infectious Disease Diagnostics and Public Health Microbiology

et al. 2016

Clinicogenomics is the exploitation of genome sequence data for diagnostic, therapeutic, and public health purposes. Central to this field is the high-throughput DNA sequencing of genomes and metagenomes. The role of clinicogenomics in infectious disease diagnostics and public health microbiology was the topic of discussion during a recent symposium (session 161) presented at the 115th general meeting of the American Society for Microbiology that was held in New Orleans, LA. What follows is a collection of the most salient and promising aspects from each presentation at the symposium.

The explosion of microbiome research is driven by high-throughput DNA sequencing, so-called next-generation sequencing (NGS), technologies that allow the genomic content of entire microbial communities (bacterial, viral, and eukaryotic organisms) to be described. Although much of this work is aimed at describing the structure of "commensal" communities, the methodology works equally well to identify pathogens in clinical samples. The key concept in using NGS methodology is that detection of microbes is independent of culture and is not limited to targets used for PCR assays. Rather, it is a process of generating large-scale sequence data sets that adequately sample a specimen for microbial content and then of applying computational methods to resolve the sequences into individual species, genes, pathways, or other features.

Most microbiome analyses have focused on describing bacterial content, and this is usually performed by sequencing the 16S rRNA gene. PCR primers with degenerative sequences are used to amplify all or part of the 16S rRNA gene from a broad range of species in the sample. The mix of amplicons generated from different organisms in the community is then sequenced, and the abundance of each species is determined by the number of sequences found for its respective 16S rRNA gene. Although this is useful for defining communities, it also affords the identification of pathogens with unique 16S rRNA sequences.

The sensitivity and specificity of this method are determined in large part by the NGS technology. Before NGS, the full-length 16S rRNA gene was sequenced with high-quality, 700-base-long reads of Sanger, or chain termination, sequencing (sometimes referred to as "first-generation" sequencing technology). This was laborious and expensive, and deep sampling was not possible. When NGS became available, most work was done on the FLX sequencing instrument (a second-generation sequencing technology) from 454 Life Sciences (Roche Diagnostics, Indianapolis, IN, USA). This only permitted 400-base-long sequencing reads, and only a portion of the 16S rRNA gene was sequenced. The 16S rRNA gene has nine hypervariable regions that provide much of the specificity in species identification. With 454 sequencing, typically only three of these regions can be sequenced. Nevertheless, this allowed detection to the genus level of most taxa. This methodology can correctly identify pathogens in stool samples from patients with diarrhea comp...
