2020
DOI: 10.1038/s41586-020-2853-0

Exome sequencing and characterization of 49,960 individuals in the UK Biobank

Abstract: The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world 1. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Ne…

Cited by 438 publications (461 citation statements)
References 61 publications
“…It is a component of the ECM in muscle, vessels and skin—tissues which display the most prominent features of type VI collagenopathies. Recent exome sequence analysis in UKBB 46 reported significant large effects on CRF of a burden of rare loss of function coding variants in COL6A1; rs8127032 could shed light on cis-regulatory control of this gene. Drawing on the ever-increasing regulatory annotations of the genome, particularly on experimentally defined trans-acting factors binding sites, could further shape precise hypotheses to be tested.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…We compared the correlation of genotypes between the exome-sequencing data released by the UK Biobank (following their SPB pipeline 113 ) and the TOPMed-imputed genotypes. The comparison assessed 49,819 individuals and 3,052,260 autosomal variants that were found in both the exome-sequencing and TOPMed-imputed datasets (matched by chromosome, position and alleles, and with an imputation quality of at least 0.3 in the TOPMed-imputed data).…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Variants were called on each CRAM with DeepVariant 5 0.10.0 using a deep learning model retrained on exome data sequenced with the same protocol as was used to sequence the UK Biobank samples 8 . Variant calls were restricted to the exome capture region and the 100 base-pairs flanking each capture target, resulting in a gVCF (genomic VCF) for each sample containing all variant genotypes and compressed representations of reference regions without called variant genotypes.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…Two sets of NovaSeq exome sequence data were generated from the HG002 control sample 10 via the exome sequencing protocol applied to UK Biobank samples 8 and then mapped via the OQFE protocol. Two additional CRAMs were created from each HG002 OQFE CRAM by recalibrating the base qualities (+BQSR CRAM) and then applying the FE binning strategy (+BQSR+FEbin CRAM) as described in the FE protocol 4 .…”
Section: Methods (citation type: mentioning)
confidence: 99%