2019
DOI: 10.1101/748228
Preprint

Chromosome-level hybrid de novo genome assemblies as an attainable option for non-model organisms

Abstract: The emergence of third generation sequencing (3GS; long-reads) is bringing closer the goal of chromosome-size fragments in de novo genome assemblies. This allows the exploration of new and broader questions on genome evolution for a number of non-model organisms. However, long-read technologies result in higher sequencing error rates and therefore impose an elevated cost of sufficient coverage to achieve high enough quality. In this context, hybrid assemblies, combining short-reads and long-read…

Cited by 4 publications (3 citation statements)
References 37 publications
“…Despite advances in sequencing technology that have led to dramatic improvements in draft genomes for non‐model species, it remains vital to evaluate the quality and completeness of genome assemblies. As noted above, well known deficiencies in genome assemblies include difficulty in assembling repetitive, duplicated, and GC rich regions that can often be addressed with long‐read sequencing (Sedlazeck et al, 2018), but at the expense of sequencing error which may influence estimates of gene content (Jaworski et al, 2020; Watson & Warr, 2019). The trade‐offs of various sequencing technologies can be offset by applying multiple platforms (Peona et al, 2021; Rhie, McCarthy, et al, 2020), however many chromosome‐level genome assemblies have extensive gaps within and among scaffolds that may hinder utility of these hybrid assemblies for subsequent studies (Domanska et al, 2018; Peona et al, 2021).…”
Section: Evaluating Assemblies | Citation type: mentioning
confidence: 99%
“…Although LRs are becoming more widely used for de novo genome assembly, using hybrid approaches (that utilize a complementary SR dataset) is still popular for several reasons: (1) SRs have higher accuracy and can be generated by Illumina sequencers at a high throughput for a lower cost; (2) plenty of SR datasets are already publicly available for many genomes; (3) for some basic tasks such as variant calling (SNV and short indel detection), SRs still provide better resolution owing to their high accuracy, which often motivates researchers to generate SRs even when LRs are in hand; and (4) unlike PacBio assemblies whose accuracy increases with the depth of coverage thanks to their unbiased random error model ( Myers, 2014 ), constructing reference quality genomes solely from ONT reads remains challenging owing to biases in base calling, even with a high coverage ( Koren et al., 2017 ; Antipov et al., 2015 ). As a result, hybrid assembly approaches are still useful ( Jaworski et al., 2019 ; Jiang et al., 2019 ; Kadobianskyi et al., 2019 ).…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Although LRs are becoming more widely used for de novo genome assembly, using hybrid approaches (that utilize a complementary SR dataset) is still popular for several reasons: (i) SRs have higher accuracy and can be generated by Illumina sequencers at a high throughput for a lower cost; (ii) plenty of SR datasets are already publicly available for many genomes; (iii) for some basic tasks such as variant calling (SNV and short indel detection), SRs still provide better resolution due to their high accuracy which often motivates researchers to generate SRs even when LRs are in hand; and (iv) unlike PacBio assemblies whose accuracy increases with the depth of coverage thanks to their unbiased random error model [23], constructing reference quality genomes solely from ONT reads remains challenging due to biases in base calling, even with a high coverage [14,1]. As a result, hybrid assembly approaches are still useful [8,9,10].…”
Section: Introduction | Citation type: mentioning
confidence: 99%