We analyzed the mRNA levels for 36,778 transcript expression traits (probes) from 2,765 individuals to comprehensively investigate the genetic architecture and degree of missing heritability for gene expression in peripheral blood. We identified 11,204 cis and 3,791 trans independent expression quantitative trait loci (eQTL) by using linear mixed models to perform genome-wide association analyses. Furthermore, using information on both closely and distantly related individuals, heritability was estimated for all expression traits. Of the set of expressed probes (15,966), 10,580 (66%) had an estimated narrow-sense heritability (h²) greater than zero with a mean (median) value of 0.192 (0.142). Across these probes, on average the proportion of genetic variance explained by all eQTL (h²_COJO) was 31% (0.060/0.192), meaning that 69% is missing, with the sentinel SNP of the largest eQTL explaining 87% (0.052/0.060) of the variance attributed to all identified cis- and trans-eQTL. For the same set of probes, the genetic variance attributed to genome-wide common (MAF > 0.01) HapMap 3 SNPs (h²_SNP) accounted for on average 48% (0.093/0.192) of h². Taken together, the evidence suggests that approximately half the genetic variance for gene expression is not tagged by common SNPs, and of the variance that is tagged by common SNPs, a large proportion can be attributed to identifiable eQTL of large effect, typically in cis. Finally, we present evidence that, compared with a meta-analysis, using individual-level data results in an increase of approximately 50% in power to detect eQTL.
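The percentages quoted in the abstract above follow from simple ratios of the reported mean variance components. The following is a minimal sketch that reproduces that arithmetic; the variable names are illustrative labels chosen here, not notation from the paper, and the numbers are only those stated in the abstract.

```python
# Figures quoted in the abstract above; variable names are illustrative labels.
expressed_probes = 15966   # probes passing the expression filter
heritable_probes = 10580   # probes with estimated heritability > 0
h2_total = 0.192           # mean narrow-sense heritability
h2_all_eqtl = 0.060        # mean variance explained by all identified eQTL
h2_sentinel = 0.052        # mean variance explained by the largest eQTL's sentinel SNP
h2_common_snps = 0.093     # mean variance attributed to common HapMap 3 SNPs


def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


summary = {
    "probes_with_h2_above_zero": pct(heritable_probes, expressed_probes),
    "explained_by_all_eqtl": pct(h2_all_eqtl, h2_total),
    "missing_after_eqtl": pct(h2_total - h2_all_eqtl, h2_total),
    "sentinel_share_of_eqtl": pct(h2_sentinel, h2_all_eqtl),
    "tagged_by_common_snps": pct(h2_common_snps, h2_total),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Note that these are ratios of means across probes, as written in the abstract, not means of per-probe ratios; the two can differ when heritability varies widely between probes.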
In the originally published version of this article, 0.05 in the Table 1 legend should have been "0.05 or 0.005," and 0.005 in the Table 2 legend should have been 0.05. Also, on page 5, the quality-control (QC) procedure has been clarified to indicate that it was performed by the QC working group of the Alzheimer Disease Sequencing Project instead of the authors. These errors have been corrected online and in print. The Journal apologizes for the errors and any confusion they may have caused.
We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10 kb upstream of genes, while 12–25% is attributed to coding regions, 32–44% to introns, and 22–28% to distal 10–500 kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.
Expression quantitative trait locus (eQTL) detection has emerged as an important tool for unraveling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and the magnitude of effects. We describe a series of simulation studies designed to evaluate the impact of linkage disequilibrium (LD) on the fine mapping of causal variants with typical eQTL effect sizes. In the presence of multisite regulation, even though between 80 and 90% of modeled eSNPs associate with normally distributed traits, up to 10% of all secondary signals could be statistical artifacts, and at least 5% but up to one-quarter of credible intervals of SNPs within r² > 0.8 of the peak may not even include a causal site. The Bayesian methods eCAVIAR and DAP (Deterministic Approximation of Posteriors) provide only modest improvement in resolution. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine mapping of causal variants needs to be adjusted for multisite influences, as conditional estimates can be highly biased by interference among linked sites, but ultimately experimental verification of individual effects is needed. Presumably similar conclusions apply not just to eQTL mapping, but to multisite influences on fine mapping of most types of quantitative trait.
Transcript co-expression is regulated by a combination of shared genetic and environmental factors. Here, we estimate the proportion of co-expression that is due to shared genetic variance. To do so, we estimated the genetic correlations between each pairwise combination of 2469 transcripts that are highly heritable and expressed in whole blood in 1748 unrelated individuals of European ancestry. We identify 556 pairs with a significant genetic correlation of which 77% are located on different chromosomes, and report 934 expression quantitative trait loci, identified in an independent cohort, with significant effects on both transcripts in a genetically correlated pair. We show significant enrichment for transcription factor control and physical proximity through chromatin interactions as possible mechanisms of shared genetic control. Finally, we construct networks of interconnected transcripts and identify their underlying biological functions. Using genetic correlations to investigate transcriptional co-regulation provides valuable insight into the nature of the underlying genetic architecture of gene regulation.