2019
DOI: 10.1111/1755-0998.13020

Improving Illumina assemblies with Hi‐C and long reads: An example with the North African dromedary

Abstract: Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate‐pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi‐C and Dovetail Genomics Chicago libraries, and long‐read sequencing, such as Pacific Biosciences and Oxford Nanopore, help span and resolve repetitive regions and therefore improve genome assemblies. One important livestock species of arid regions th…

Citations: cited by 50 publications (77 citation statements)
References: 73 publications
“…Pacific Biosciences (PacBio) sequencing is the most commonly used third-generation sequencing technology generating long reads, its accuracy remains inferior to that of the second-generation Illumina sequencing technology (Tedersoo, Tooming-Klunderud & Anslan, 2018). Therefore, increasing attention has been paid to the combination of PacBio and Illumina techniques for genome assembly (Elbers et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…In the CamDro3 annotation, we predicted 22,917 genes that produced 34,135 proteins, and 7.4% (1,705) of genes had no assigned annotation. These numbers are slightly higher than for the CamDro2 assembly for which we had predicted 22,534 genes that produced 34,024 proteins, and 7.7% (1,730) of genes had no assigned annotation [11]. We assessed if predicted proteins were truncated due to uncorrected indels introduced by PacBio reads by comparing the predicted protein length hit distribution of the CamDro1 assembly (Illumina only data, Figure 1, red line), which should lack such PacBio specific error, to that of the CamDro2 (Figure 1, green line) and CamDro3 assemblies (Figure 1, blue line).…”
Section: Improved Camelus Dromedarius Genome Assembly (mentioning)
confidence: 59%
“…Next, the Chicago assembly was scaffolded with Hi-C data. Using a PacBio Sequel sequencer, 11x long-read coverage were generated ([11]; Sequence Read Archive (SRA) accession: SRP050586) and PBJelly [39] was used to fill in gaps in the Hi-C assembly. PBJelly assembly was polished with Pilon [40] employing the same trimmed and error-corrected Illumina short-insert sequences used for the de novo assembly of CamDro1 by Fitak et al ([8]; SRA accession: SRR2002493).…”
Section: Previous Dromedary Genome Assemblies (mentioning)
confidence: 99%
“…The PacBio assembly comprised 4,402 contigs totalling 2.09 Gb, which possessed an N50 of 5.37 Mb (Table 1). The continuity of our PacBio assembly was not only dramatically improved compared to that of CB1 and MBC1 (Figure S2), but was also better than some other genome assemblies that were based on long-read sequencing (Bickhart et al, 2017; Gordon et al, 2016; Jain et al, 2018), including the improved version of the dromedary genome (CamDro2) (Elbers et al, 2019). To reduce the base errors within the assembly, we polished the contigs using both the PacBio reads and Illumina reads.…”
Section: Contig Assembly and Polish (mentioning)
confidence: 99%
“…Additionally, new scaffolding technology such as chromatin interaction mapping (Hi-C) allowed the contigs to be assembled to the scale of full chromosomes (Bickhart et al, 2017; Burton et al, 2013). Recently, Hi-C technology and low-coverage long-read sequencing (11×) were applied to upgrade the short-read assembly of the dromedary (Elbers et al, 2019). Based on the upgraded assembly, the genomic organization of natural killer cell receptor genes in camels was characterized (Futas et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%