2020
DOI: 10.1099/mgen.0.000341

An assessment of genome annotation coverage across the bacterial tree of life

Abstract: Although gene-finding in bacterial genomes is relatively straightforward, the automated assignment of gene function is still challenging, resulting in a vast quantity of hypothetical sequences of unknown function. But how prevalent are hypothetical sequences across bacteria, what proportion of genes in different bacterial genomes remain unannotated, and what factors affect annotation completeness? To address these questions, we surveyed over 27 000 bacterial genomes from the Genome Taxonomy Database, and measu…

Citations: cited by 77 publications (82 citation statements)
References: 49 publications
“…This suggests that the root cause(s) of the phenotypes observed in the lomA mutant remain(s) elusive, but are likely due to the preferentially impacted COGs (signaling, motility, and envelope biogenesis). This complexity is compounded by the fact that Leptospira, and, more generally, all spirochetes, possess many putative genes that code for proteins of unknown function with little to no sequence identity with characterized proteins from model organisms (66).…”
Section: Discussion (mentioning)
confidence: 99%
“…The LookingGlass 'universal language of life' creates representations of DNA sequences that capture their functional and evolutionary relevance, independent of whether the sequence is contained in reference databases. The vast majority of microbial diversity is uncultured and unannotated [4][5][6]. LookingGlass opens the door to .…”
Section: Discussion (mentioning)
confidence: 99%
“…The microbial world is dominated by "microbial dark matter" - the majority of microbial genomes remain to be sequenced [4,5], while the molecular functions of many genes in microbial genomes are unknown [6]. In microbial communities (microbiomes), the combination of these factors compounds this limitation.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, the vast majority of organisms has been sketchily annotated, only through electronic inference based on homology associations (Gaudet et al, 2011). This causes inconsistencies regarding the semantic network profile describing biological processes across species, and even taxonomically related species could have divergence in annotation coverage (Lobb et al, 2020). All these predicate upon a standardized version of GO, suitable for comparative, evolutionary analyses.…”
Section: Pathway Analysis -A Pn Semantic Profiles Based On the Biolog (mentioning)
confidence: 99%