2019
DOI: 10.1101/705616
Preprint

Human Genome Assembly in 100 Minutes

Abstract: De novo genome assembly provides comprehensive, unbiased genomic information and makes it possible to gain insight into new DNA sequences not present in reference genomes. Many de novo human genomes have been published in the last few years, leveraging a combination of inexpensive short-read and single-molecule long-read technologies. As long-read DNA sequencers become more prevalent, the computational burden of generating assemblies persists as a critical factor. The most common approach to long-read assembly…

Citations: cited by 101 publications (114 citation statements)
References: 29 publications
“…We first evaluated HiCanu on a 24 kbp HiFi library from a Drosophila melanogaster F1 hybrid (ISO1×A4) (Data Availability). To match typical coverage, the HiFi dataset was downsampled to 40× and assembled with the HiFi-specific tools, HiCanu and Peregrine (Chin and Khalak 2019), as well as the conventional long-read assembler, Canu. Canu was chosen as it was previously shown to achieve the highest assembly continuity and superior repeat resolution among other popular long-read assemblers on HiFi data (Wenger et al 2019).…”
Section: Drosophila Genome Assembly
mentioning
confidence: 99%
“…Many of these so-called "challenge" BACs were deliberately selected from genomic regions which pose significant assembly challenges (i.e. regions with segmental duplications), making them useful for assembly benchmarking (Chin and Khalak 2019; Shafin et al 2019; Miga et al 2019; Vollger et al 2020). Table 3 summarizes how well the challenge BACs are captured by different assemblies.…”
Section: Human Genome Assemblies
mentioning
confidence: 99%
“…It is possible that some MHC reads from HG002 are missed if they come from parts in the MHC region where HG002 is very different from the primary GRCh37 MHC region. In order to catch all possible reads that indeed belong to the MHC region of HG002, we also generated a de novo assembly of the HG002 MHC region using the Peregrine Assembler 22 and extracted reads that map to the de novo assembled contigs as unphased reads.…”
Section: Recruiting WGS Reads for the MHC Region of HG002
mentioning
confidence: 99%
“…These results are comparable with the previous study 3 that used family trio information to haplotag 79.2% of HiFi PacBio reads. We next assembled haplotype-specific reads into completely phased de novo assemblies using one of the most popular assemblers, Canu 26, and the recently described fast assembler for HiFi data, Peregrine 27. While Peregrine generated more contiguous genome assemblies (N50 contig: H1: 28 Mbp, H2: 29.1 Mbp) compared to Canu (H1: 9.9 Mbp, H2: 10.7 Mbp), we noted more misassemblies, especially chimeric contigs, near the end of contigs (Supplementary Table 1).…”
mentioning
confidence: 99%