2021
DOI: 10.1186/s13059-021-02393-0

GUNC: detection of chimerism and contamination in prokaryotic genomes

Abstract: Genomes are critical units in microbiology, yet ascertaining quality in prokaryotic genome assemblies remains a formidable challenge. We present GUNC (the Genome UNClutterer), a tool that accurately detects and quantifies genome chimerism based on the lineage homogeneity of individual contigs using a genome’s full complement of genes. GUNC complements existing approaches by targeting previously underdetected types of contamination: we conservatively estimate that 5.7% of genomes in GenBank, 5.2% in RefSeq, and…

Cited by 156 publications (136 citation statements)
References: 52 publications
“…As MAGs are not always submitted to INSDC repositories, we are exploring incorporation of additional repositories into the GTDB such as MGnify ( 38 ). The quality of MAGs is an ongoing concern and we are evaluating methods for decontaminating MAGs ( 39 ) and identifying and removing MAGs that may contribute to phylogenetic instability ( 40 ). Related to MAG quality, we expect to change the AF criteria used for assigning genomes to GTDB species clusters from 0.65 to 0.5 starting in the next GTDB release in order to better accommodate the growing number of incomplete MAGs contained in the GTDB which can cause the AF between MAGs to be artificially low.…”
Section: Concluding Remarks and Future Plans
Citation type: mentioning
confidence: 99%
“…There remains an urgent need for methods to identify non–cognate contigs in fractionated assemblies, with the impact of contamination on gene–level becoming more widely recognised (Arkhipova, 2020), and one recently published analysis suggests that up to 15–30% of publicly–available MAGs classified at pHQ level will harbour chimeric content (Orakov et al, 2021). In the present study, we have examined removal of possible contamination using the RefineM workflow (Parks et al, 2017).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…MAGpurify [10] removes scaffolds that are far from those from the same MAG by considering information from multiple sources, such as phylogenetic marker genes, clade-specific markers and GC content. GUNC [145] calculates the clade separation and reference representation scores to quantify genomic chimerism. The construction of MAGs should ideally be performed for all of the high-, medium- and low-abundance microbes.…”
Section: Outlook Potential Challenges and Strategies To Address Them
Citation type: mentioning
confidence: 99%