2019
DOI: 10.1101/848366
Preprint

Long read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits

Abstract: Long-read sequencing (LRS) promises to improve characterization of structural variants (SVs), a major source of genetic diversity. We generated LRS data on 1,817 Icelanders using Oxford Nanopore Technologies, and identified a median of 23,111 autosomal structural variants per individual (a median of 11,506 insertions and 11,576 deletions), spanning cumulatively a median of 9.9 Mb. We found that rare SVs are larger in size than common ones and are more likely to impact protein function. We discovered an associa…

Cited by 55 publications (74 citation statements)
References 80 publications
“…Recent studies indicated that large structural variants can be identified accurately from genome graphs [32, 53, 54, 55]. Eventually, a bovine genome graph that unifies multiple breed-specific haplotype-resolved genome assemblies and their sites of variation might provide access to sources of variation that are currently neglected when short sequencing reads are aligned to a linear reference sequence [56, 57].…”
Section: Discussion (mentioning)
confidence: 99%
“…Moreover, the low throughput of modern lrWGS platforms renders them impractical for adoption in most large-scale population studies. The largest published assembly-based PacBio study has analyzed just 15 genomes, 22 while a recent study from Iceland analyzed 1,817 ONT genomes, 24 by comparison to millions of genomes that have already been sequenced or commissioned using srWGS. Given this predominance of srWGS in the current landscape of genomics research, we present here a series of analyses from the HGSVC to: (i) define and quantify the limitations of SV detection from srWGS; (ii) benchmark expectations for the number and class of variants that can be reliably detected from srWGS; (iii) predict the genomic features that drive false positive and false negative discoveries for each technology; and (iv) establish the scientific and clinical advances offered by state-of-the-art lrWGS assembly as a complementary approach to srWGS.…”
Section: Main Text (mentioning)
confidence: 99%
“…After a detailed quality control analysis (Figure S1) 83,486 SVs were identified, consistent with previous reports using LR-WGS (Figure S2). 11 Focusing on rare variants (allele count <= 10 in gnomAD v3, NIHR BioResource and NGC project) 14; 17; 18 in SERPINC1 and flanking regions, 10 candidate heterozygous SVs were observed in 9 individuals (Figure 1C). Visual inspection of read alignments identified an additional heterozygous SV in a region of low coverage involving SERPINC1.…”
Section: Main Text (mentioning)
confidence: 99%