2020
DOI: 10.1186/s13059-019-1885-y

Performance difference of graph-based and alignment-based hybrid error correction methods for error-prone long reads

Abstract: The error-prone third-generation sequencing (TGS) long reads can be corrected by the high-quality second-generation sequencing (SGS) short reads, which is referred to as hybrid error correction. We here investigate the influences of the principal algorithmic factors of two major types of hybrid error correction methods by mathematical modeling and analysis on both simulated and real data. Our study reveals the distribution of accuracy gain with respect to the original long read error rate. We also demonstrate …

Citations: cited by 9 publications (8 citation statements)
References: 34 publications
“…SGS has the advantages of low cost, high accuracy and short sequencing time, but shorter sequence read lengths; TGS has longer sequence reads but higher error rates [47]. As a result, more and more studies on the response to plant stress are now beginning to make use of combined SGS and TGS sequencing methods, which can provide a more complete and high-quality assembly at the transcriptome level [48–50].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…A high quality de novo genome assembly using only ONT reads would require high sequencing quality on base accuracy, long read length, and great sequencing depth (Tyson et al, 2018; Vaser et al, 2017), increasing the sequencing cost. By introducing the high accuracy short reads to the assembly polishing, most errors could be corrected as long as the long reads error rate was below 15 % (Wang and Au, 2020), and the requirement on long read coverage can be decreased. The improvement on sequence accuracy provides great advantages to the BUSCO score and gene annotation (Johnson et al, 2020).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Moreover, long reads can span the entire tandems of repeats in the genome, resolve the complex regions and improve contiguity that short reads could not achieved (Shin et al, 2019; Tyson et al, 2018). On the other hand, short read polishing could make up the problem of high error rate in long read assembly and cut down the requirement of sequencing depth (Wang and Au, 2020). Therefore, hybrid assembly method greatly improves the genome assembly quality and cuts down the unit price of data generation (Díaz-Viraqué et al, 2019; Miller et al, 2017; Tan et al, 2018).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Previous papers already covered this topic. On the one hand, [24] and [70] focused on hybrid correction. The former proposed a benchmark of ten tools on five datasets, whereas the latter studied performances of graph-based and alignment-based methods.…”
Section: Contribution
Citation type: mentioning
confidence: 99%