2022
DOI: 10.1093/bioinformatics/btac448

BubbleGun: enumerating bubbles and superbubbles in genome graphs

Abstract: Motivation: With the fast development of sequencing technology, accurate de novo genome assembly is now possible even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process, but also in the context of pangenomes representing a population. In both cases, polymorphic loci lead to bubble structures in such graphs. Detecting bubbles is hence an important task when working with genomic variants in the context of genome graphs. …

Cited by: 7 publications (4 citation statements)
References: 16 publications
“…4f ) does not have a superbubble when and are taken as different vertices. Dabbaghie et al (2022) implemented the algorithm by Onodera et al (2013) for sequence graphs, but how the issues above are handled is not apparent. The closest concept to weak superbubble in bidirected graphs is snarl ( Paten et al 2018 ), which leads to a bidirected subgraph that can be separated from the rest of the graph.…”
Section: Methods (mentioning)
confidence: 99%
“…The HiFi cMAGs were filtered from initial assemblies using an adapted workflow published before (Kim et al, 2022). The workflow is briefly described as follows: (1) the circular contigs without assembly bubbles (assessed by BubbleGun v1.1.6 (Dabbaghie et al, 2022) as required) and repeats were filtered from the three assemblies; (2) contigs with sequence lengths shorter than 100 kb were removed; (3) the completeness and taxonomic classification of remaining contigs were assessed using GTDB-Tk v2.1.1 (Parks et al, 2022), and contigs with bacterial marker genes (bac120) less than 80 were removed (at this step, no archaea genome was found); (4) the SSU rRNA of the remaining contigs was predicted using barrnap v0.9 (https://github.com/tseemann/barrnap), and contigs missing 5S, 16S or 23S rRNA were removed; (5) the tRNA of remaining contigs was identified using tRNAscan-SE v2.0.11 (Chan et al, 2021), and contigs with tRNA types fewer than 18 were filtered out; and (6) ANI values of contigs were calculated using FastANI v1.33 (Jain et al, 2018). Redundant contigs were those with ANI values ≥ 99% and alignment percentages ≥ 95%, and the contigs with the maximal sum of ANI values were selected as the representative sequences.…”
Section: Methods (mentioning)
confidence: 99%
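
The per-contig criteria in steps (1) to (5) of the statement above can be sketched as a single predicate. This is only an illustration under stated assumptions, not the citing authors' code: the `Contig` record, its field names, and `passes_filters` are hypothetical, and the values are assumed to have been parsed beforehand from the outputs of the tools named in the statement (BubbleGun, GTDB-Tk, barrnap, tRNAscan-SE). The ANI-based dereplication of step (6) is not covered.

```python
from dataclasses import dataclass

# Hypothetical record; field names are assumptions, not any tool's output format.
@dataclass(frozen=True)
class Contig:
    name: str
    length_bp: int
    circular: bool
    has_bubbles: bool      # assumed to come from a bubble check such as BubbleGun
    has_repeats: bool
    bac120_markers: int    # number of bac120 marker genes detected
    rrna_found: frozenset  # rRNA types predicted, e.g. {"5S", "16S", "23S"}
    trna_types: int        # number of distinct tRNA types identified

REQUIRED_RRNA = frozenset({"5S", "16S", "23S"})

def passes_filters(c: Contig) -> bool:
    """Apply thresholds (1)-(5) as quoted in the citation statement."""
    # (1) keep only circular contigs without assembly bubbles or repeats
    if not c.circular or c.has_bubbles or c.has_repeats:
        return False
    # (2) remove contigs shorter than 100 kb
    if c.length_bp < 100_000:
        return False
    # (3) remove contigs with fewer than 80 bac120 marker genes
    if c.bac120_markers < 80:
        return False
    # (4) remove contigs missing any of 5S, 16S or 23S rRNA
    if not REQUIRED_RRNA <= c.rrna_found:
        return False
    # (5) remove contigs with fewer than 18 tRNA types
    if c.trna_types < 18:
        return False
    return True
```

A list of parsed records could then be filtered with `[c for c in contigs if passes_filters(c)]` before any dereplication step.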
“…The HiFi cMAGs were filtered from initial assemblies using an adapted workflow published before (Kim et al, 2022). The workflow is briefly described as follows: (1) the circular contigs without assembly bubbles (assessed by BubbleGun v1.1.6 (Dabbaghie et al, 2022) as required) and repeats were filtered from the three assemblies;…”
Section: Filtering Taxonomic Classification and Functional Annotation... (mentioning)
confidence: 99%