Serotyping forms the basis of national and international surveillance networks for Salmonella, one of the most prevalent foodborne pathogens worldwide (1-3). Public health microbiology is currently being transformed by whole-genome sequencing (WGS), which opens the door to serotype determination using WGS data. SeqSero (www.denglab.info/SeqSero) is a novel Web-based tool for determining Salmonella serotypes using high-throughput genome sequencing data. SeqSero is based on curated databases of Salmonella serotype determinants (rfb gene cluster, fliC and fljB alleles) and is predicted to determine serotype rapidly and accurately for nearly the full spectrum of Salmonella serotypes (more than 2,300 serotypes), from both raw sequencing reads and genome assemblies. The performance of SeqSero was evaluated by testing (i) raw reads from genomes of 308 Salmonella isolates of known serotype; (ii) raw reads from genomes of 3,306 Salmonella isolates sequenced and made publicly available by GenomeTrakr, a U.S. national monitoring network operated by the Food and Drug Administration; and (iii) 354 other publicly available draft or complete Salmonella genomes. We also demonstrated Salmonella serotype determination from raw sequencing reads of fecal metagenomes from mice orally infected with this pathogen. SeqSero can help to maintain the well-established utility of Salmonella serotyping when integrated into a platform of WGS-based pathogen subtyping and characterization.
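As a rough illustration of how calls for the three serotype determinants could be combined into a serotype, the following Python sketch joins an O antigen call (rfb gene cluster) with the phase 1 (fliC) and phase 2 (fljB) flagellar antigen calls and looks the result up in a table. It is not SeqSero's implementation: the names `AntigenCalls`, `antigenic_formula`, `lookup_serotype`, and `SEROTYPE_TABLE` are hypothetical, the table is a three-entry excerpt, and the formulas are simplified to a single major O factor.

```python
from typing import NamedTuple, Optional


class AntigenCalls(NamedTuple):
    """Hypothetical container for the three determinant calls."""
    o_antigen: str      # O antigen call, e.g. derived from the rfb gene cluster
    h1: Optional[str]   # phase 1 flagellar antigen (fliC allele), None if not detected
    h2: Optional[str]   # phase 2 flagellar antigen (fljB allele), None if not detected


# Illustrative excerpt only (assumption): formulas are reduced to one
# major O factor and do not reproduce the full serotyping scheme.
SEROTYPE_TABLE = {
    "4:i:1,2": "Typhimurium",
    "9:g,m:-": "Enteritidis",
    "8:e,h:1,2": "Newport",
}


def antigenic_formula(calls: AntigenCalls) -> str:
    """Join the calls as O:H1:H2, writing '-' for an undetected phase."""
    return ":".join([calls.o_antigen, calls.h1 or "-", calls.h2 or "-"])


def lookup_serotype(calls: AntigenCalls) -> Optional[str]:
    """Return the serotype name, or None if the formula is not in the table."""
    return SEROTYPE_TABLE.get(antigenic_formula(calls))


if __name__ == "__main__":
    isolate = AntigenCalls(o_antigen="4", h1="i", h2="1,2")
    print(antigenic_formula(isolate), lookup_serotype(isolate))
```

Keeping the formula as a plain string makes a missing phase explicit (`-`), so a monophasic isolate is looked up under its own key and is not silently matched to a biphasic serotype.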
Salmonella is the most prevalent foodborne pathogen in the United States, causing 1.2 million cases of illness annually and the largest health burden among all bacterial pathogens (4). The U.S. National Salmonella Surveillance System has been built upon serotyping in public health laboratories, a subtyping method traditionally performed through the agglutination of Salmonella cells with specific antisera that detect lipopolysaccharide O antigen and flagellar H antigens. Specific combinations of O and H antigenic types represent serotypes (or serovars). More than 2,500 Salmonella serotypes have been described in the White-Kauffmann-Le Minor scheme (5, 6). The phenotypic determination of serotypes is labor-intensive and time-consuming (taking at least 2 days), which has led to the development of genetic methods for serotype determination (7, 8). These methods generally use two categories of targets for serotype determination: (i) indirect targets, requiring the use of random surrogate genomic markers associated with particular serotypes, and (ii) direct targets, requiring the use of genetic determinants of serotypes, including the rfb gene cluster responsible for somatic (O) group synthesis (9, 10) and the fliC (11) and fljB (12) genes encoding the two flagellar antigens present in Salmonella. The latter approach has the advantage of determining serotypes using the same markers as the phenotypic method, providing continuity between the serotypes determined by phenotypic and genetic markers (13, 14). While this approach may result in distinct genetic lineages bei...