2021
DOI: 10.1038/s41586-021-03451-0

Towards complete and error-free genome assemblies of all vertebrate species

Abstract: High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1–4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assembli…

Cited by 1,720 publications (754 citation statements)
References 110 publications
“…The VGP version 1 assembly pipeline developed for the nuclear genome uses Continuous Long Reads (CLR) and the Pacific Biosciences (PacBio) assembler FALCON to generate contigs [35,36]. When initially inspecting the contigs, we noted the absence of contigs representing the mitogenome in 75% of species.…”
Section: Results (mentioning)
confidence: 99%
“…The high level of similarity is supportive of the overall quality of the mitoVGP assembly, which is also confirmed by its Q44.30 base call accuracy and 100% identity to the NOVOPlasty assembly in non-repetitive regions. In the kakapo (Strigops habroptilus), an entire 2.3-kbp CR region, including a ~925-bp-long repeat (repeat unit = 84 bp), was also missing from the RefSeq sequence (long-range PCR and direct Sanger sequencing [35,43], Fig. 2c).…”
Section: Novel Duplications Repeats and Heteroplasmy (mentioning)
confidence: 99%