2024
DOI: 10.3389/fbinf.2024.1278228

Ten common issues with reference sequence databases and how to mitigate them

Samuel D. Chorlton

Abstract: Metagenomic sequencing has revolutionized our understanding of microbiology. While metagenomic tools and approaches have been extensively evaluated and benchmarked, far less attention has been given to the reference sequence database used in metagenomic classification. Issues with reference sequence databases are pervasive. Database contamination is the most recognized issue in the literature; however, it remains relatively unmitigated in most analyses. Other common issues with reference sequence databases inc…

Cited by 9 publications (1 citation statement)
References 105 publications

“…Additionally, the accuracy of the bioinformatics tools and databases used can impact the final species identification ( 16 ). (iv) Incompleteness of databases and reference sequences: many microbes have not yet been cultured and sequenced, meaning that even with full-length 16S rRNA sequencing, it might not be possible to accurately identify all species ( 45 ). (v) Ecological and evolutionary factors: gene horizontal transfer and recombination events among microbial species can cause differences in the 16S rRNA gene sequence, which may mislead species identification and analysis ( 46 ).…”
Section: Discussion
Citation type: mentioning
confidence: 99%