Protein-truncating variants can have profound effects on gene function and are critical for clinical genome interpretation and for generating therapeutic hypotheses, but their relevance to medical phenotypes has not been systematically assessed. We characterized the effect of 18,228 protein-truncating variants across 135 phenotypes from the UK Biobank and found 27 associations between medical phenotypes and protein-truncating variants in genes outside the major histocompatibility complex. For genes implicated as protective against disease or associated with at least one phenotype in our study, we performed phenome-wide analyses and directly measured effects in homozygous carriers, commonly referred to as "human knockouts," across medical phenotypes; several of these genes showed strong pleiotropic or non-additive effects. Our results illustrate the importance of protein-truncating variants in a variety of diseases.
Protein-truncating variants (PTVs), genetic variants predicted to shorten the coding sequence of genes, are a promising set of variants for drug discovery, since identifying PTVs that protect against human disease provides in vivo validation of therapeutic targets 1,2,3,4. Although tens of thousands of standing germline PTVs have been identified 5,6, their medical relevance across a broad range of phenotypes has not been characterized. Because most PTVs are present at low frequency, assessing their effects requires genotype data from a large number of individuals, linked to phenotype data for a variety of diseases and physiological measurements. The recent release of genotype data with linked clinical and questionnaire data for 488,377 individuals in the UK Biobank provides an unprecedented opportunity to assess the clinical impact of truncating protein-coding genes at a resolution not previously possible.
bioRxiv preprint first posted online Aug. 23, 2017; doi: http://dx.doi.org/10.1101/179762. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY 4.0 International license.
Results
To assess the clinical relevance of PTVs, we cataloged predicted PTVs present on the Affymetrix UK Biobank array and their effects on medical phenotypes in 337,208 unrelated individuals from the UK Biobank study 7,8. We defined PTVs as single-nucleotide variants (SNVs) predicted to introduce a premature stop codon or to disrupt a splice site, or small insertions or deletions (indels) predicted to disrupt a transcript's reading frame 5. After filtering, we identified 18,228 predicted PTVs on the UK Biobank array that were polymorphic, spanning 8,750 genes (Methods, Figure S1). On average, each participant carried 95 predicted PTVs with minor allele frequency (MAF) less than 1%, and 778 genes were predicted to be homozygous or compound heterozygous for PTVs with MAF less than 1% in at least one individual. The observed number of PTVs per individual is consistent with the ~100 loss-of-function variants observed in the 1000 Genomes project 9. In contrast...