2019
DOI: 10.1101/831248
Preprint

TGS-GapCloser: fast and accurately passing through the Bermuda in large genome using error-prone third-generation long reads

Abstract: The completeness and accuracy of genome assemblies determine the quality of subsequent bioinformatics analysis. Despite benefiting from the medium/long-range information of third-generation sequencing techniques, current gap-closing tools to enhance assemblies suffer multi-alignments and high error rates, resulting in huge time and money costs. We developed a software tool, TGS-GapCloser that uses the low depth (>=10X) single molecule sequencing long reads without any error correction to close gaps. The algori…

Cited by 21 publications (20 citation statements)
References 41 publications
“…To further improve the integrity and accuracy of the genome assembly, we employed tgs‐gapcloser, which uses low‐depth (≥10×) single‐molecule sequencing long reads without any error correction to close gaps in the draft assembly (Xu et al, 2019). The long sequences were split into three groups, including total reads (with options --min_idy 0.2, --min_match 200 --r_round 1), reads with length ≥20 kb (with options --min_idy 0 --min_match 0 --r_round 3) and reads with length 2–20 kb (with options --min_idy 0 --min_match 0 --r_round 3), and each group was used to fill the corresponding aligned gaps.…”
Section: Methods; citation type: mentioning
confidence: 99%
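
The read grouping described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' script: the function name `group_reads_by_length`, the group labels, and the exact boundary handling (2 kb inclusive, 20 kb exclusive for the middle group) are assumptions, since the quoted text gives only the length ranges.

```python
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, str]  # (read name, sequence)


def group_reads_by_length(
    reads: Iterable[Read],
    long_min: int = 20_000,  # assumed cutoff for the ">=20 kb" group
    mid_min: int = 2_000,    # assumed lower bound of the "2-20 kb" group
) -> Dict[str, List[Read]]:
    """Split reads into the three groups named in the quoted statement.

    Every read goes into "all"; reads of at least `long_min` bases also go
    into "ge_20kb"; reads in [mid_min, long_min) also go into "2_to_20kb".
    """
    groups: Dict[str, List[Read]] = {"all": [], "ge_20kb": [], "2_to_20kb": []}
    for name, seq in reads:
        groups["all"].append((name, seq))
        length = len(seq)
        if length >= long_min:
            groups["ge_20kb"].append((name, seq))
        elif length >= mid_min:
            groups["2_to_20kb"].append((name, seq))
    return groups


if __name__ == "__main__":
    demo = [("r1", "A" * 1_000), ("r2", "A" * 2_000), ("r3", "A" * 20_000)]
    for label, members in group_reads_by_length(demo).items():
        print(label, [name for name, _ in members])
```

Each group would then be written to its own read file and passed to a separate gap-closing run with the options quoted for that group.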
“…We assembled the genome into 4,244 scaffolds which span ∼669.73 Mb (99.45% of the estimated genome size 673.41Mb) with an ultra-long scaffold N50 of ∼9.62 Mb. To improve the continuity of this assembly, we sequenced more 10.17 Gb (∼13.14-fold) single molecular long reads using Nanopore sequencing platform, resulting in a notable increasing of contig N50 value from 255.61 Kb to 2.31 Mb using TGS-GapCloser [20] (Supplemental table 2). Based on the high quality draft genome assembly, we also sequenced 21.23 Gb data of a Hi-C library and anchored 647.59 Mb (∼96.80% of the whole assembly) scaffold sequences onto 20 chromosomes (Fig.…”
Section: Results; citation type: mentioning
confidence: 99%
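
The statement above reports a contig N50 that rises after gap closing while the scaffold N50 is a separate, larger figure. The sketch below shows why, assuming the common convention that contigs are the stretches of a scaffold between runs of `N`. The function names and the toy sequences are illustrative only and do not come from the cited papers.

```python
import re
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that pieces of length >= L cover at least
    half of the total; returns 0 for an empty input."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0


def contigs_from_scaffold(seq: str) -> List[str]:
    """Split a scaffold at runs of N (gaps), dropping empty pieces."""
    return [piece for piece in re.split(r"[Nn]+", seq) if piece]


if __name__ == "__main__":
    # Toy scaffolds: the first one contains a 4-base gap.
    scaffolds = ["ACGT" * 3 + "NNNN" + "ACGT" * 2, "ACGTAC"]
    contig_lengths = [
        len(c) for s in scaffolds for c in contigs_from_scaffold(s)
    ]
    print("scaffold N50:", n50(len(s) for s in scaffolds))
    print("contig N50 before closing:", n50(contig_lengths))

    # Replacing the gap with real bases merges the two flanking contigs.
    closed = [scaffolds[0].replace("NNNN", "ACGT"), scaffolds[1]]
    closed_lengths = [len(c) for s in closed for c in contigs_from_scaffold(s)]
    print("contig N50 after closing:", n50(closed_lengths))
```

Filling a gap merges its flanking contigs into one longer contig, so contig N50 can grow substantially even when the scaffold structure is unchanged.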
“…In this process, the stLFR reads were first pre-processed to be compatibly handled by supernova assembler, using the stLFR2Supernova pipeline (https://github.com/BGI-Qingdao/stlfr2supernova_pipeline). Then, we enhanced the draft assembly using TGS-GapCloser pipeline[20] based on the single molecular long reads.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…To further improve the continuity, we sequenced ∼10.2 Gb (∼13.1×) Nanopore long reads to fill the gaps. With these long reads, the contig N50 was further improved from 255.6 Kb to 2.31 Mb (Table S2) using TGS-GapCloser (Xu et al., 2019). To anchor the scaffold sequences to chromosomes, we constructed a Hi-C library and sequenced ∼21.2 Gb Hi-C data and thus ∼650.44 Mb sequences were anchored to 20 chromosomes (Figures 1A, S2, and Table S3), which was consistent with the previous report on African arowana karyotype (Oliveira et al., 2019).…”
Section: Results; citation type: mentioning
confidence: 99%