2020
DOI: 10.1101/gr.260828.120

Accurate reconstruction of bacterial pan- and core genomes with PEPPAN


Citations: cited by 81 publications (93 citation statements)
References: 78 publications (125 reference statements)
“…We used the whole-genome MLST (wgMLST) scheme for the genus Salmonella in EnteroBase [3, 20] to test the accuracy of BlastFrost for detecting sequence variants of a large number of query sequences in a large number of related genomes. That wgMLST scheme consists of 21,065 single copy orthologs which had been derived from a pan-genome of 537 representative genomes of Salmonella with PEPPAN [20, 21]. EnteroBase identifies diverse sequence variants of those loci in each assembled genome by combining BLASTN [8] nucleotide and UBLAST [22] amino acid queries, and also scores the absence of significant hits for each genome.…”
Section: Results (mentioning)
confidence: 99%
“…All MLST schemes are inherently limited, because they are based on a fixed selection of genes that were present in an initial, representative set of genomes. However, many bacterial genera are associated with open pan-genomes [33], whose content continues to increase with each additional genome that is sequenced [21], and such novel sequences are not routinely appended to the MLST schemes. Therefore, it is important to emphasize that BlastFrost and Bifrost are not dependent on MLST or on genomic annotations, but can handle any collection of closely related genomic assemblies.…”
Section: Discussion (mentioning)
confidence: 99%
“…Given the myriad of software pipelines for pangenome reconstruction available (over 40) (Vernikos, 2020), we assessed three different pipelines to define the pangenome of the Blautia dataset: Roary (Page et al, 2015), panX (Ding et al, 2018) and PEPPAN (Zhou et al, 2020). Roary and panX, well-established packages based on their number of citations and publication dates, define clusters of homologous proteins in a broadly similar way.…”
Section: Results (mentioning)
confidence: 99%
“…The pangenome reconstruction of the blautia dataset was performed with Roary (Page et al, 2015), panX (Ding et al, 2018) and PEPPAN (Zhou et al, 2020). For all the programs, input files were generated by prokka (default settings, --kingdom Bacteria) (Seemann, 2014).…”
Section: Methods (mentioning)
confidence: 99%
“…The pangenome reconstruction of the blautia dataset was performed with Roary (Page et al, 2015), panX (Ding et al, 2018) and PEPPAN (Zhou et al, 2020). For all the programs, input files were generated by prokka (default settings, --kingdom Bacteria) (Seemann, 2014).…”
Section: Pangenome Analysis (mentioning)
confidence: 99%