2021
DOI: 10.1101/2021.02.26.432990
Preprint

DENTIST – using long reads for closing assembly gaps at high accuracy

Abstract: Long sequencing reads allow increasing contiguity and completeness of fragmented, short-read-based genome assemblies by closing assembly gaps, ideally at high accuracy. While several gap-closing methods have been developed, these methods often close an assembly gap with sequence that does not accurately represent the true sequence. Here, we developed DENTIST, a sensitive, highly accurate and automated pipeline method to close gaps in short-read assemblies with long reads. DENTIST comprehensively determines rep…

Cited by 2 publications (4 citation statements)
References 24 publications
“…Furthermore, we distributed the computation on our compute cluster by (i) computing the read alignments for the whole assembly, (ii) splitting the assembly into blocks of ∼200 Mb and dividing the read alignments accordingly, (iii) applying PacBio GC to each block separately, and (iv) merging the unmodified contigs with the processed gaps into the output assembly. This complete workflow can be found at [31, 32].…”
Section: Methods (mentioning)
confidence: 99%
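Step (ii) of the workflow quoted above, splitting an assembly into blocks of roughly 200 Mb, could be sketched as follows. This is only an illustration and not the citing authors' workflow, which is referenced at [31, 32]. The function name `split_into_blocks`, its parameters, and the greedy grouping strategy are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def split_into_blocks(
    seq_lengths: Iterable[Tuple[str, int]],
    target_block_size: int = 200_000_000,
) -> List[List[str]]:
    """Greedily group sequences into consecutive blocks of about target_block_size bp.

    Hypothetical helper: a sequence is never split, so a single sequence
    longer than the target ends up in a block of its own.
    """
    blocks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, length in seq_lengths:
        # Start a new block when adding this sequence would exceed the target.
        if current and current_size + length > target_block_size:
            blocks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += length
    if current:
        blocks.append(current)
    return blocks


if __name__ == "__main__":
    # Toy lengths in bp; real input would come from the assembly's FASTA index.
    toy = [("s1", 120), ("s2", 90), ("s3", 50), ("s4", 300), ("s5", 10)]
    for i, block in enumerate(split_into_blocks(toy, target_block_size=200)):
        print(i, block)
```

Keeping each sequence whole means the read alignments can be divided by sequence name alone, at the cost of uneven block sizes when individual sequences are large.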
“…All data underlying this article, including the reference and test assemblies with introduced gaps and their true sequence as valuable data for future method comparisons, are available via our institutional server [31]. Supporting data and an archival copy of the code are also available via the GigaScience repository GigaDB [33].…”
Section: Data Availability (mentioning)
confidence: 99%
“…LoRMA [150], NaS [151], proovread [152] Long reads Canu [114], CONSENT [153], Daccord [154], FLAS [155], HALC [156], NextDenovo [121], MECAT [118], MECAT2 [118], NECAT [120] Polishing Short reads ntEdit [157], Pilon [158], POLCA [159] Short & long reads Apollo [160], Hapo-G [161], HyPo [162], Racon [163] Nanopolish [164], Quiver [165] Haplotig purging Long reads HaploMerger2 [166], purge dups [167], Purge Haplotigs [168] Scaffolding Short reads Bambus [169], BATISCAF [170], BESST [171], BOSS [172], Mate pairs GRASS [173], MIP [174], Opera [175], ScaffMatch [176], ScaffoldScaffolder [177], SCARPA [178], SCOP [179], SLIQ [180], SOPRA [181], SSPACE [182], WiseScaffolder [183] Long reads DENTIST [184], LINKS [185], LRScaf [186], npScarf [187], PBJelly [188], RAILS [189], SLR [190],...…”
Section: Assembly Pre and Post-processing (mentioning)
confidence: 99%
“…Cobbler [189], DENTIST [184], FGAP [219], GMcloser [220], LR Gapcloser [221], PBJelly [188], PGcloser [222], TGS-GapCloser [223] thoroughly reviewed on multiple datasets [225]. When tested on Caenorhabditis elegans Nanopore reads, the error rate decreased from 28.93% to less than 1% (using Canu [114], CONSENT [153], FLAS [155], Jabba [148],…”
Section: Long Reads (mentioning)
confidence: 99%