2021
DOI: 10.1371/journal.pbio.3001421

Exploring bacterial diversity via a curated and searchable snapshot of archived DNA sequences

Abstract: The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661…

Cited by 96 publications (115 citation statements)
References 60 publications (70 reference statements)
“…To demonstrate the utility of our approach, we searched for and subtyped FII-33 plasmids in a curated collection of 661,405 bacterial draft genome sequences ( 27 ). The FII-33 repA1 gene was detected in 423 genomes.…”
Section: Results (mentioning)
confidence: 99%
“…The sequences in Text S1 were used to query the COBS index ( 26 ) of 661,405 curated draft genomes ( 27 ) with a kmer similarity cutoff of 1.00 such that only identical sequences were detected. The estimated core genome distances of the 423 genomes that contained the FII-33 repA1 gene were extracted from the pp-sketch index of the 661,405-genome collection ( 27 ) using pp-sketchlib (version 1.5.1) ( 28 ).…”
Section: Methods (mentioning)
confidence: 99%
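
To make the "kmer similarity cutoff of 1.00" in the statement above concrete, here is a minimal sketch of what such a cutoff means: a genome is reported only if every k-mer of the query occurs in it. This is not the COBS implementation or its API. COBS is a probabilistic index, whereas this toy uses exact Python sets. The function names, the default `k=31`, and the toy sequences are all assumptions for illustration, and reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(query, target, k=31):
    """Fraction of the query's distinct k-mers that also occur in target."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        raise ValueError("query is shorter than k")
    return len(query_kmers & kmers(target, k)) / len(query_kmers)


def search(query, genomes, k=31, cutoff=1.0):
    """Names of genomes whose k-mer similarity to query meets the cutoff.

    genomes is a hypothetical dict mapping a genome name to its sequence.
    With cutoff=1.0, only genomes containing every query k-mer are returned.
    """
    return [
        name
        for name, seq in genomes.items()
        if kmer_similarity(query, seq, k) >= cutoff
    ]


# Toy data (not real sequences); k is kept small so the example is readable.
toy_genomes = {
    "genome_a": "TTACGTACGG",  # contains the query exactly
    "genome_b": "TTACGTTCGG",  # differs from the query by one base
}
hits = search("ACGTAC", toy_genomes, k=4, cutoff=1.0)
```

In this sketch a single mismatch inside the query region removes several k-mers at once, so `genome_b` falls well below the cutoff; lowering `cutoff` would admit such near matches.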
“…For reference sketch construction, we used a collection of assemblies containing bacterial genomes from the entire European Nucleotide Archive (ENA) in 2018 (n = 660,333) (25). Metadata from precomputed assembly genotypes was used to subset assemblies with complete lineage designation for inclusion (MLST).…”
Section: Methods (mentioning)
confidence: 99%
“…We searched for the presence of mutations identified in our study in publicly available sequencing data using a searchable COBS index of bacterial genomes curated from the European Nucleotide Archive (Blackwell et al, 2021). We used the Python interface to search the COBS index (Bingmann et al, 2019).…”
Section: Methods (mentioning)
confidence: 99%