2017
DOI: 10.1101/110999
Preprint

W2RAP: a pipeline for high quality, robust assemblies of large complex genomes from short read data

Abstract: Producing high-quality whole-genome shotgun de novo assemblies from plant and animal species with large and complex genomes using low-cost short read sequencing technologies remains a challenge. But when the right sequencing data, with appropriate quality control, is assembled using approaches focused on robustness of the process rather than maximization of a single metric such as the usual contiguity estimators, good quality assemblies with informative value for comparative analyses can be produced. Here we p…

Citations: cited by 34 publications (32 citation statements)
References: 15 publications
“…Here we report the most complete wheat genome assembly to date, representing almost 80% of the 17Gbp genome in large scaffolds. We combined high-quality PCR-free libraries and precisely size-selected LMP libraries (Heavens et al, 2015) with the w2rap assembly software (Clavijo, 2016) to generate contiguous and complete assemblies from relatively low (∼33×) Illumina paired-end Table 4: Disease Resistance and Gluten gene repertoires in the TGACv1 assembly. Resistance genes were identified by their characteristic domain architecture (Sarris et al, 2016).…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…Nearly 3 million contigs (of length greater than 500bp) were generated using the w2rap-contigger (Clavijo, 2016) with an N50 of 16.7kbp ( Supplemental Table S4.3). After scaffolding using SOAPdenovo (Luo et al, 2012), the assembly contained 1.3 million sequences with an N50 of 83.9kbp.…”
Section: Genome Assembly | Citation type: mentioning
confidence: 99%
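
The statement above reports contiguity as N50 over contigs longer than 500 bp. As a minimal sketch of how that statistic is conventionally computed (this is not code from w2rap or from the citing paper; the function name `n50` and the toy lengths are hypothetical), N50 is the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length:

```python
def n50(lengths, min_length=0):
    """Return the N50 of the sequence lengths strictly longer than min_length.

    Hypothetical helper for illustration only; it follows the common
    definition of N50 rather than any specific tool's implementation.
    """
    # Keep only sequences above the length cutoff, longest first.
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences longer than min_length")

    half_total = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        # The first length at which the cumulative sum reaches half
        # of the total is the N50.
        if running >= half_total:
            return n


# Toy contig lengths in bp (made up, not taken from any assembly).
toy_lengths = [100, 600, 800, 1000, 2000, 3000]
print(n50(toy_lengths, min_length=500))
```

Because short sequences are dropped before the total is computed, the cutoff changes the result, which is why reported N50 values are only comparable when the same length filter is applied.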
“…DNA samples were fragmented to ~450 bp and sequenced to at least 60X coverage using paired-end, 250 bp reads on the Illumina Hi-Seq 2500 according to DISCOVAR de novo protocol 20 . The w2rap-contigger 18 was then used to assemble contigs (Table S1.2). Heterozygous genomes such as these yield complex assembly graphs where loci fail to collapse into a single representation but are expanded into two alternative alleles ( Figure S1.1, left side).…”
Section: Genome Assembly With Discovar De Novo and W2rap | Citation type: mentioning
confidence: 99%
“…We used an innovative short-read assembly method (w2rap 18 , an extension of DISCOVAR de novo 19,20 ) to generate 20 new reference genome assemblies for species sampled from both major Heliconius sub-clades and three additional genera of Heliconiini (Supplementary Information Section 1). This strategy relies on high fidelity PCR-free Illumina sequencing, and is particularly powerful for low-complexity regions 21 .…”
Citation type: mentioning
confidence: 99%