Salmonella enterica serovar Enteritidis is a major agent of foodborne disease worldwide. In Uruguay, this serovar was almost negligible until the mid-1990s, but since then it has become the most prevalent. Previously, we characterized a collection of strains isolated from 1988 to 2005 and found that the two oldest strains were the most genetically divergent. To further characterize these strains, we sequenced and annotated eight genomes, including those of the two oldest isolates. We report the identification and characterization of a novel 44 kbp Salmonella prophage found exclusively in these two genomes. Sequence analysis reveals that the prophage is a mosaic, with regions homologous to different Salmonella prophages. It contains 60 coding sequences, including two genes, gogB and sseK3, involved in virulence and modulation of the host immune response. Analysis of serovar Enteritidis genomes available in public databases confirmed that this prophage is absent from most of them, with the exception of a group of 154 genomes. All 154 strains carrying this prophage belong to the same sequence type (ST-1974), suggesting that its acquisition occurred in a common ancestor. We tested this by phylogenetic analysis of 203 genomes representative of the intraserovar diversity. ST-1974 forms a distinctive monophyletic lineage, and the newly described prophage is a phylogenetic signature of this lineage that could be used as a molecular marker. The phylogenetic analysis also shows that the major ST (ST-11) is polyphyletic and might have given rise to almost all other STs, including ST-1974.
Apiculate yeasts belonging to the genus Hanseniaspora are predominant on grapes and other fruits. While some species, such as Hanseniaspora uvarum, are well known for their abundance on fruits, they are generally characterized by a detrimental effect on fermentation quality because of their excessive production of acetic acid. However, the species Hanseniaspora vineae is adapted to fermentation and is currently considered an enhancer of positive flavour and sensory complexity in foods. Since 2002, we have been isolating strains of this species and conducting winemaking processes with them. In parallel, we have characterized this species from genes to metabolites. In 2013, we sequenced the genomes of two H. vineae strains, the first apiculate yeast genomes to be determined. In the last ten years, it has become possible to understand its biology and to discover features that are very peculiar compared with conventional Saccharomyces yeasts, such as a natural and unique G2 cell cycle arrest and the mandelate pathway for benzenoid synthesis. All these characteristics contribute to phenotypes of proven biotechnological interest for winemaking and the production of other foods.
Motivation: The use of high precision for representing quality scores in nanopore sequencing data makes these scores hard to compress and, thus, responsible for most of the information stored in losslessly compressed FASTQ files. This motivates the investigation of the effect of quality score information loss on downstream analysis from nanopore sequencing FASTQ files.
Results: We polished de novo assemblies for a mock microbial community and a human genome, and we called variants on a human genome. We repeated these experiments using various pipelines, under various coverage level scenarios, and with various quality score quantizers. In all cases we found that the quantization of quality scores makes little difference to (or sometimes even improves) the results obtained with the original (non-quantized) data. This suggests that the precision currently used for nanopore quality scores may be unnecessarily high, and motivates the use of lossy compression algorithms for this kind of data. Moreover, we show that even a non-specialized compressor, like gzip, yields large storage space savings after quantization of quality scores.
Availability: Quantizers are freely available for download at https://github.com/mrivarauy/QS-Quantizer
Supplementary information: Available at https://github.com/mrivarauy/QS-Quantizer
We investigate the effect of quality score information loss on downstream analysis from nanopore sequencing FASTQ files. We polished de novo assemblies for a mock microbial community and a human genome, and we called variants on a human genome. We repeated these experiments using various pipelines, under various coverage level scenarios, and with various quality score quantizers. In all cases we found that the quantization of quality scores makes little difference to (or even improves) the results obtained with the original (non-quantized) data. This suggests that the precision currently used for nanopore quality scores is unnecessarily high, and motivates the use of lossy compression algorithms for this kind of data. Moreover, we show that even a non-specialized compressor, like gzip, yields large storage space savings after quantization of quality scores.
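To make the idea of quality score quantization concrete, the following is a minimal Python sketch, not the quantizers distributed in the QS-Quantizer repository. It maps each Phred score in a FASTQ quality line to one of four representative values and compares the gzip-compressed size before and after. The bin boundaries, the representative values, the Phred+33 encoding, and all function names are assumptions chosen purely for illustration, and the input is synthetic random data rather than real nanopore reads.

```python
import gzip
import random

# Hypothetical 4-level quantizer (illustrative values, not from the paper):
# each pair is (exclusive upper bound, representative score for that bin).
BINS = [(7, 5), (14, 12), (21, 18)]
TOP_VALUE = 24      # representative score for everything >= 21
PHRED_OFFSET = 33   # assumes the common Phred+33 ASCII encoding


def quantize_score(q: int) -> int:
    """Map a single Phred score to the representative value of its bin."""
    for upper, representative in BINS:
        if q < upper:
            return representative
    return TOP_VALUE


def quantize_quality_line(line: str) -> str:
    """Quantize every character of one FASTQ quality line."""
    return "".join(
        chr(quantize_score(ord(c) - PHRED_OFFSET) + PHRED_OFFSET) for c in line
    )


def quantize_fastq(text: str) -> str:
    """Quantize quality lines (every 4th line) of a 4-line-per-record FASTQ."""
    out = []
    for i, line in enumerate(text.splitlines()):
        out.append(quantize_quality_line(line) if i % 4 == 3 else line)
    return "\n".join(out) + "\n"


def gzip_size(text: str) -> int:
    """Size in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("ascii")))


def make_synthetic_fastq(n_reads: int = 200, read_len: int = 100, seed: int = 0) -> str:
    """Build a toy FASTQ with random bases and random scores in 0..40."""
    rng = random.Random(seed)
    records = []
    for r in range(n_reads):
        bases = "".join(rng.choice("ACGT") for _ in range(read_len))
        quals = "".join(chr(rng.randint(0, 40) + PHRED_OFFSET) for _ in range(read_len))
        records.append(f"@read{r}\n{bases}\n+\n{quals}\n")
    return "".join(records)


if __name__ == "__main__":
    original = make_synthetic_fastq()
    quantized = quantize_fastq(original)
    print("gzip size, original :", gzip_size(original))
    print("gzip size, quantized:", gzip_size(quantized))
```

Because the quantized quality lines use a much smaller alphabet, a general-purpose compressor such as gzip has far less entropy to encode, which is the effect the abstract describes. How much is saved on real data, and whether downstream results are preserved, depends on the actual quantizer and pipeline.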
Salmonella enterica serovar Enteritidis has been a major cause of foodborne disease in Uruguay since 1995. We used a genomic approach to study a set of isolates from different sources and years. Whole genome phylogeny showed that most of the strains are distributed in two major lineages (E1 and E2), both belonging to MLST sequence type 11, the major ST among serovar Enteritidis. Strikingly, E2 isolates are over-represented in periods of outbreak abundance in Uruguay, while E1 isolates span all epidemic periods. Both lineages circulate in neighboring countries on the same timescale as in Uruguay, and are present in minor numbers in distant countries. We identified allelic variants associated with each lineage. Three genes, ycdX, pduD and hsdM, have distinctive variants in E1 that may result in defective products. Another four genes (ybiO, yiaN, aas, aceA) present variants specific to the E2 lineage. Overall, this work shows that S. enterica serovar Enteritidis strains circulating in Uruguay have the same phylogenetic profile as strains circulating in the region, as well as in more distant countries. Based on these results, we hypothesize that the E2 lineage, which is more prevalent during epidemics, exhibits a combination of allelic variants that could be associated with its epidemic ability.
Salmonella is a major cause of human foodborne disease worldwide. A singular epidemiological feature of human salmonellosis is that one particular serovar can become prevalent over the others, but the prevalent serovar may change over time. Before the 1980s, Salmonella enterica serovar Typhimurium was the most commonly isolated serovar worldwide, but at the beginning of the 1990s S. enterica serovar Enteritidis emerged as the most common cause of human salmonellosis, first in Europe and then in many other countries around the world [1-6]. The reasons for this serovar shift are still not fully understood. Several studies have addressed the phylogenetic diversity within S. enterica serovar Enteritidis, and suggested that diversification events could be related to its epidemiological features [7-10]. The work of Allard et al. first reported the use of whole genome sequencing (WGS) and single nucleotide polymorphism (SNP) analysis to address the molecular epidemiology of a set of isolates previously indistinguishable by other techniques. Deng et al. suggested that serovar Enteritidis diversified from a few major lineages spread worldwide, and associated some lineages with geographic and epidemiological characteristics. Feasey et al. found a strong correlation between prophage content and accessory genome features and the establishment of a new epidemiological course. In our studies, we found that the acquisition of a new prophage was probably a determining factor in the onset of a particular lineage [8]. The vast majority of S. enterica serovar Enteritidis isolates worldwide belong to eBurst Group 4 (EBG4). Achtman et al. and more recently our group described that among EBG4, multi locus sequence type