2011
DOI: 10.1038/nbt.1740

Haplotype-resolved genome sequencing of a Gujarati Indian individual

Abstract: Haplotype information is essential to the complete description and interpretation of genomes 1, genetic diversity 2 and genetic ancestry 3. Although individual human genome sequencing is increasingly routine 4, nearly all such genomes are unresolved with respect to haplotype. Here we combine the throughput of massively parallel sequencing 5 with the contiguity information provided by large-insert cloning 6 to experimentally determine the haplotype-resolved genome of a South Asian individual. A single fosmid…

citations
Cited by 221 publications
(250 citation statements)
references
References 33 publications
“…Phasing was performed with PacBio long reads, Illumina short reads, 10X Genomics linked reads 4 (30×), and reads from BACs representing a single haplotype (47×). Heterozygous SNVs called from these methods are unambiguously assigned to two alternative phases, producing phased blocks with an N50 length of 11.6 Mb, considerably longer than previously reported 4,6,8,15,16 (Table 1). We assessed the accuracy of the phased blocks against the end sequences of BACs, and found a long-range switch error rate to be under 0.3%.…”
Section: Bac_168-g09
mentioning
confidence: 73%
“…1 Tables 5 and 6). Fosmid pooling has been used for re-sequencing 16,17 , and our results show that the combination of fosmid pooling, NGS and hierarchical assembly provides a new, cost-effective alternative for de novo sequencing and assembly of complex genomes.…”
Section: Sequencing and Hierarchical Assembly
mentioning
confidence: 81%
“…All data from these validation experiments in the tumors (that is, confirmed somatic variants, false-positive and false-negative variants) are accessible in Supplementary Table 1. Functional annotation of SNVs. Several computational tools and databases were used to predict the functional effect of coding and noncoding SNVs, including known or predicted protein coding genes, miRNAs and their target sites 26 , TRANSFAC transcription factor binding sites 44 , OregANNO annotated regulatory sites 45 , Vista Enhancer sites 46 , conservation as indicated by the presence of GERP constraint elements 17 , phastcons conserved elements 47 and repeat elements. The effect of coding mutations was assessed using SIFT 48 , PolyPhen 49 and CanPredict 50 .…”
Section: Validation Of Somatic Missense Mutations In the Tumor-normal
mentioning
confidence: 99%
“…Another commonly used approach is to apply quality filters that are aimed at selectively removing errors. Every whole-genome sequence reported so far has used filtering to some extent: the most commonly used filters being those that remove sequences with a too-low coverage depth, discard variants with a low-confidence score or eliminate variants located within a cluster of variants 3,7,10–25. Surprisingly, there is little consensus with respect to which filters should be used and at which threshold they should be applied.…”
mentioning
confidence: 99%