2019
DOI: 10.1038/s41598-019-51284-9

Scalable Genome Assembly through Parallel de Bruijn Graph Construction for Multiple k-mers

Abstract: Remarkable advancements in high-throughput gene sequencing technologies have led to an exponential growth in the number of sequenced genomes. However, unavailability of highly parallel and scalable de novo assembly algorithms have hindered biologists attempting to swiftly assemble high-quality complex genomes. Popular de Bruijn graph assemblers, such as IDBA-UD, generate high-quality assemblies by iterating over a set of k-values used in the construction of de Bruijn graphs (DBG). However, this process of sequ…

Citations: cited by 13 publications (14 citation statements)
References: 21 publications
“…The new version of GenomeScope 2 is also applicable to polyploid genomes. Finally, some assemblers, such as IDBA [41], IDBA-UD [42] and subsequently SPAdes, SOAPdenovo2, or the recently developed ScalaDBG [43], have implemented innovative ways to deal with the choice of the best k-mer by using a multi-k-mer approach.…”
Section: WGS Metagenomics (mentioning)
confidence: 99%
“…It employs DSK [47] to count k-mers and (k +1)-mers which only requires a fixed user-defined amount of memory. Assembler ScalaDBG [48] is a scalable genome assembler through parallel de-Bruijn graph construction for multiple k-mers. This assembler first performs graph construction in parallel for each k-value, then for each pair of graphs, the higher k-valued graph is patched using the lower k-valued graph to generate a single graph.…”
Section: De-Bruijn Graph-Based Assemblers (mentioning)
confidence: 99%
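
The first step described in the statement above, building one de Bruijn graph per k-value in parallel, can be illustrated with a minimal Python sketch. This is not ScalaDBG's implementation: the function names `build_dbg` and `build_all` are hypothetical, a thread pool stands in for whatever parallel runtime the assembler actually uses, and the patching of the higher-k graph with the lower-k graph is not reproduced here.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_dbg(reads, k):
    """Build a toy de Bruijn graph for one k-value (hypothetical helper).

    Nodes are (k-1)-mers; each k-mer in a read adds an edge from its
    prefix (k-1)-mer to its suffix (k-1)-mer.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def build_all(reads, k_values):
    """Construct one graph per k-value concurrently (illustrative only).

    Each k-value is independent of the others, so the constructions can
    run side by side; a thread pool is used here purely for simplicity.
    """
    with ThreadPoolExecutor() as pool:
        graphs = pool.map(lambda k: build_dbg(reads, k), k_values)
        return dict(zip(k_values, graphs))


reads = ["ACGTAC", "CGTACG"]
graphs = build_all(reads, [3, 4])
for k, graph in graphs.items():
    print(k, graph)
```

Because the graphs for different k-values share no state in this sketch, the per-k work has no ordering constraint, which is the property the quoted description relies on before the pairwise patching step.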
“…The k-mers that appear above a certain threshold frequency, and are therefore expected to be legitimate, are solid k-mers, the others are called insolid or untrusted k-mers. It has been found that performance of EC tools is extremely sensitive to the chosen k-value [17,4], and the optimal value is both tool- and dataset-dependent. Small values of k result in an increase in the probability of overlap between reads at the cost of not allowing the algorithm to distinguish between erroneous and correct k-mers.…”
Section: K-mer Based EC Tools (mentioning)
confidence: 99%
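
The solid versus untrusted distinction in the statement above can be sketched as a frequency cut-off over k-mer counts. The sketch below is a simplified illustration, not the method of any particular error-correction tool: `count_kmers` and `split_solid` are hypothetical names, and treating "at least `threshold` occurrences" as solid is an assumption, since the quoted text does not say whether the bound is inclusive.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def split_solid(counts, threshold):
    """Partition k-mers into solid and untrusted sets.

    Assumption: a k-mer is solid when it occurs at least `threshold` times.
    """
    solid = {kmer for kmer, c in counts.items() if c >= threshold}
    untrusted = set(counts) - solid
    return solid, untrusted


reads = ["ACGTA", "ACGTA", "ACGTT"]
counts = count_kmers(reads, 3)
solid, untrusted = split_solid(counts, 2)
print(sorted(solid), sorted(untrusted))
```

In this toy input the k-mer `GTT` occurs once and falls below the cut-off, which is how a k-mer introduced by a single sequencing error would typically be flagged.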
“…The drop in high-throughput sequencing costs has offered unprecedented opportunities to characterize genomes, metagenomes, and single-cell genomes across the tree-of-life. Third-generation sequencing technologies [1,2] have demonstrated the potential to produce unparalleled genome assemblies due to their capability to generate staggeringly longer reads, at a much faster pace, and remarkably low costs [3,4], albeit, at low signal-to-noise ratios. This fundamental challenge makes genomic sequence identification and assembly, challenging, often necessitating hybrid correction approaches over self-correction from a consensus of long reads [5].…”
Section: Introduction (mentioning)
confidence: 99%