Genetic effects on gene expression and splicing can be modulated by cellular and environmental factors; yet interactions between genotypes, cell type and treatment have not been comprehensively studied together. We used an induced pluripotent stem cell system to study multiple cell types derived from the same individuals and exposed them to a large panel of treatments. Cellular responses involved different genes and pathways for gene expression and splicing, and were highly variable across contexts. For thousands of genes, we identified variable allelic expression across contexts and characterized different types of gene-environment interactions, many of which are associated with complex traits. Promoter functional and evolutionary features distinguished genes with elevated allelic imbalance mean and variance. On average half of the genes with dynamic regulatory interactions were missed by large eQTL mapping studies, indicating the importance of exploring multiple treatments to reveal previously unrecognized regulatory loci that may be important for disease.
While variance components analysis has emerged as a powerful tool in complex trait genetics, existing methods for fitting variance components do not scale well to large-scale datasets of genetic variation. Here, we present a method for variance components analysis that is accurate and efficient: capable of estimating one hundred variance components on a million individuals genotyped at a million SNPs in a few hours. We illustrate the utility of our method in estimating and partitioning variation in a trait explained by genotyped SNPs (SNP heritability). Analyzing 22 traits with genotypes from 300,000 individuals across about 8 million common and low frequency SNPs, we observe that per-allele squared effect size increases with decreasing minor allele frequency (MAF) and linkage disequilibrium (LD), consistent with the action of negative selection. Partitioning heritability across 28 functional annotations, we observe enrichment of heritability in FANTOM5 enhancers in asthma, eczema, thyroid and autoimmune disorders.
The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method to estimate the variation in a complex trait that can be attributed to additive (additive heritability) and dominance deviation (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of additive and dominance heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency [MAF] > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no statistically significant evidence for dominance heritability (p < 0.05/50, accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
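To make the additive versus dominance-deviation distinction concrete, here is a minimal Python sketch of one common genotype coding. The abstract does not state which encoding the method uses, so the orthogonal dominance-deviation coding below (genotypes 0, 1, 2 mapped to 0, 2p, 4p − 2 before standardization) and the function names `additive_encoding`, `dominance_encoding` and `hwe_moments` are assumptions made for illustration only.

```python
import numpy as np

def additive_encoding(g, p):
    """Standardized additive coding of allele counts g in {0, 1, 2}."""
    return (g - 2 * p) / np.sqrt(2 * p * (1 - p))

def dominance_encoding(g, p):
    """Standardized dominance-deviation coding (assumed, not from the abstract).

    Raw values are 0 -> 0, 1 -> 2p, 2 -> 4p - 2; under Hardy-Weinberg
    equilibrium these have mean 2p^2 and standard deviation 2p(1 - p).
    """
    raw = np.array([0.0, 2 * p, 4 * p - 2])[g]  # look up by allele count
    return (raw - 2 * p**2) / (2 * p * (1 - p))

def hwe_moments(p):
    """Exact moments of the two codings under Hardy-Weinberg proportions."""
    g = np.array([0, 1, 2])
    w = np.array([(1 - p) ** 2, 2 * p * (1 - p), p**2])  # genotype frequencies
    a = additive_encoding(g, p)
    d = dominance_encoding(g, p)
    return {
        "mean_d": w @ d,        # mean of the dominance coding
        "var_d": w @ d**2,      # second moment of the dominance coding
        "cov_ad": w @ (a * d),  # cross moment with the additive coding
    }
```

With this coding the two components are uncorrelated at a SNP in Hardy-Weinberg equilibrium, which is what allows additive and dominance variance to be estimated as separate components; `hwe_moments(0.3)` can be used to check this numerically for any allele frequency.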
A central question in human genetics is to find the proportion of variation in a trait that can be explained by genetic variation. A number of methods have been developed to estimate this quantity, termed narrow-sense heritability, from genome-wide SNP data. Recently, it has become clear that estimates of narrow-sense heritability are sensitive to modeling assumptions that relate the effect size of a SNP to its minor allele frequency (MAF) and linkage disequilibrium (LD) patterns [3]. A principled approach to estimating heritability while accounting for variation in SNP effect sizes is to apply linear mixed models (LMMs) with multiple variance components, where each variance component represents the fraction of genetic variance explained by SNPs that belong to a given range of MAF and LD values. Beyond their importance in accurately estimating genome-wide SNP heritability, multiple variance component LMMs are useful in partitioning the contribution of genomic annotations to trait heritability which, in turn, can provide insights into biological processes that are associated with the trait. Existing methods for fitting multi-component LMMs rely on maximizing the likelihood of the variance components. These methods pose major computational bottlenecks that make it challenging to apply them to large-scale genomic datasets such as the UK Biobank, which contains half a million individuals genotyped at tens of millions of SNPs. We propose a scalable algorithm, RHE-reg-mc, to jointly estimate multiple variance components in LMMs. Our algorithm is a randomized method-of-moments estimator that has a runtime that is observed to scale as $O\left(\frac{NMB}{\max(\log_3 N, \log_3 M)} + K^3\right)$ for N individuals, M SNPs, K variance components, and B ≈ 10 being a parameter that controls the number of random matrix-vector multiplications. RHE-reg-mc also efficiently computes standard errors.
We evaluate the accuracy and scalability of RHE-reg-mc for estimating the total heritability as well as in partitioning heritability. The ability to fit multiple variance components to SNPs partitioned according to their MAF and local LD allows RHE-reg-mc to obtain relatively unbiased estimates of SNP heritability under a wide range of models of genetic architecture. On the UK Biobank dataset consisting of ≈ 300,000 individuals and ≈ 500,000 SNPs, RHE-reg-mc can fit 250 variance components, corresponding to genetic variance explained by 1 Mb blocks, in ≈ 40 minutes on standard hardware.
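The idea of a randomized method-of-moments estimator for several variance components can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the RHE-reg-mc implementation: the function names `standardize`, `simulate` and `randomized_mom` are hypothetical, the simulated genotypes are independent SNPs, the products are plain dense matrix products (so the sketch does not reproduce the logarithmic speed-up in the stated runtime), and standard errors are omitted. Each component k has a relatedness matrix K_k = X_k X_kᵀ / M_k, and the traces tr(K_k K_l) in the moment equations are approximated with B random probe vectors so that no N × N matrix is ever formed.

```python
import numpy as np

def standardize(G):
    """Center and scale each SNP column to mean 0 and variance 1."""
    return (G - G.mean(axis=0)) / G.std(axis=0)

def simulate(N=1000, M_per=250, sigma2=(0.3, 0.2), seed=1):
    """Toy data: one genotype block per variance component (assumed setup)."""
    rng = np.random.default_rng(seed)
    blocks, y = [], np.zeros(N)
    for s2 in sigma2:
        maf = rng.uniform(0.05, 0.5, M_per)
        G = rng.binomial(2, maf, size=(N, M_per)).astype(float)
        X = standardize(G)
        # Per-SNP effects drawn so that the block explains variance s2 in total
        y += X @ rng.normal(0.0, np.sqrt(s2 / M_per), M_per)
        blocks.append(X)
    y += rng.normal(0.0, np.sqrt(1.0 - sum(sigma2)), N)  # residual noise
    return blocks, y - y.mean()

def randomized_mom(X_blocks, y, B=10, seed=0):
    """Solve the moment equations T @ sigma2 = q with randomized traces.

    Returns an array with one variance per block followed by the
    residual variance.
    """
    rng = np.random.default_rng(seed)
    N, K = y.shape[0], len(X_blocks)
    Z = rng.standard_normal((N, B))  # B random probe vectors
    # K_k Z computed as X_k (X_k^T Z) / M_k, using only N x M products
    KZ = [X @ (X.T @ Z) / X.shape[1] for X in X_blocks]
    T = np.empty((K + 1, K + 1))
    q = np.empty(K + 1)
    for k in range(K):
        M_k = X_blocks[k].shape[1]
        for l in range(K):
            # tr(K_k K_l) is approximated by the average of (K_k z)^T (K_l z)
            T[k, l] = np.sum(KZ[k] * KZ[l]) / B
        # tr(K_k) is computed exactly from the squared Frobenius norm
        T[k, K] = T[K, k] = np.sum(X_blocks[k] ** 2) / M_k
        # y^T K_k y = ||X_k^T y||^2 / M_k
        q[k] = np.sum((X_blocks[k].T @ y) ** 2) / M_k
    T[K, K] = N
    q[K] = y @ y
    return np.linalg.solve(T, q)  # (K+1) x (K+1) solve, the K^3 term
```

A typical use would be `blocks, y = simulate()` followed by `est = randomized_mom(blocks, y)`, with total heritability estimated as `est[:-1].sum() / est.sum()`. In practice the blocks would be SNPs grouped by MAF and LD bin, by functional annotation, or by genomic position rather than simulated independently.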
Genetic effects on gene expression and splicing can be modulated by cellular and environmental factors; yet interactions between genotypes, cell type and treatment have not been comprehensively studied together. We used an induced pluripotent stem cell system to study multiple cell types derived from the same individuals and exposed them to a large panel of treatments. Cellular responses involved different genes and pathways for gene expression and splicing processes, and were also highly variable across cell types and treatments. For thousands of genes, we identified variable allelic expression across contexts, and characterized different types of gene-environment interactions. Many of these G×E genes are associated with complex traits. We characterized promoter functional and evolutionary features that distinguish genes with elevated allelic imbalance mean and variance. More than 47% of the genes with dynamic regulatory interactions were missed by GTEx, but we identified them using a suitable allelic imbalance study design. This indicates the importance of exploring multiple treatments to reveal previously unrecognized regulatory loci that may be important for disease.