Transcriptome-wide association studies (TWASs) have been widely used to integrate gene expression and genetic data for studying complex traits. Due to the computational burden, existing TWAS methods do not assess distant trans-expression quantitative trait loci (eQTL) that are known to explain important expression variation for most genes. We propose a Bayesian genome-wide TWAS (BGW-TWAS) method that leverages both cis- and trans-eQTL information for a TWAS. Our BGW-TWAS method is based on Bayesian variable selection regression, which not only accounts for cis- and trans-eQTL of the target gene but also enables efficient computation by using summary statistics from standard eQTL analyses. Our simulation studies illustrated that BGW-TWAS achieved higher power compared to existing TWAS methods that do not assess trans-eQTL information. We further applied BGW-TWAS to individual-level GWAS data (N = ~3.3K), which identified significant associations between the genetically regulated gene expression (GReX) of ZC3H12B and Alzheimer dementia (AD) (p value = 5.42 × 10⁻¹³), neurofibrillary tangle density (p value = 1.89 × 10⁻⁶), and global measure of AD pathology (p value = 9.59 × 10⁻⁷). These associations for ZC3H12B were completely driven by trans-eQTL. Additionally, the GReX of KCTD12 was found to be significantly associated with β-amyloid (p value = 3.44 × 10⁻⁸), which was driven by both cis- and trans-eQTL. Four of the top driven trans-eQTL of ZC3H12B are located within APOC1, a known major risk gene of AD and blood lipids. Additionally, by applying BGW-TWAS with summary-level GWAS data of AD (N = ~54K), we identified 13 significant genes including known GWAS risk genes HLA-DRB1 and APOC1, as well as ZC3H12B.
Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, traditional two-stage TWAS methods first impute gene expression by creating a weighted sum that aggregates SNPs with their corresponding cis-eQTL effects on reference transcriptome. Traditional TWAS methods then employ a linear regression model to assess the association between imputed gene expression and test phenotype, thereby assuming the effect of a cis-eQTL SNP on test phenotype is a linear function of the eQTL’s estimated effect on reference transcriptome. To increase TWAS robustness to this assumption, we propose a novel Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to corresponding reference cis-eQTL effects) rather than fixed. VC-TWAS is applicable to both continuous and dichotomous phenotypes, as well as individual-level and summary-level GWAS data. Using simulated data, we show VC-TWAS is more powerful than traditional TWAS methods based on a two-stage Burden test, especially when eQTL genetic effects on test phenotype are no longer a linear function of their eQTL genetic effects on reference transcriptome. We further applied VC-TWAS to both individual-level (N = ~3.4K) and summary-level (N = ~54K) GWAS data to study Alzheimer’s dementia (AD). With the individual-level data, we detected 13 significant risk genes including 6 known GWAS risk genes such as TOMM40 that were missed by traditional TWAS methods. With the summary-level data, we detected 57 significant risk genes considering only cis-SNPs and 71 significant genes considering both cis- and trans-SNPs; these findings also validated our findings with the individual-level GWAS data. Our VC-TWAS method is implemented in the TIGAR tool for public use.
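The traditional two-stage procedure described in this abstract (impute expression as a weighted sum of SNP genotypes, then regress the phenotype on the imputed value) can be illustrated with a minimal sketch. The function names, the simulated genotypes, and the eQTL weights below are all hypothetical and chosen only for illustration; this is not the implementation used by TIGAR or any published TWAS tool.

```python
import numpy as np

def impute_grex(genotypes, eqtl_weights):
    """Stage 1: imputed expression = weighted sum of SNP genotypes,
    using cis-eQTL effect sizes estimated in a reference dataset."""
    return genotypes @ eqtl_weights

def burden_test_statistic(grex, phenotype):
    """Stage 2: simple linear regression of phenotype on imputed expression.
    Returns the slope estimate and its t statistic."""
    x = grex - grex.mean()
    y = phenotype - phenotype.mean()
    beta = (x @ y) / (x @ x)
    resid = y - beta * x
    dof = len(y) - 2
    se = np.sqrt((resid @ resid) / dof / (x @ x))
    return beta, beta / se

# Simulated toy data (assumption: 500 samples, 10 cis-SNPs, allele frequency 0.3).
rng = np.random.default_rng(0)
n, p = 500, 10
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
w = rng.normal(0, 0.5, size=p)               # hypothetical reference eQTL weights
grex = impute_grex(G, w)
y = 0.8 * grex + rng.normal(0, 1.0, size=n)  # phenotype driven linearly by GReX
beta_hat, t_stat = burden_test_statistic(grex, y)
print(f"slope = {beta_hat:.3f}, t = {t_stat:.2f}")
```

Because the phenotype is regressed on a single weighted sum, each SNP's effect on the phenotype is implicitly forced to be proportional to its eQTL weight, which is the linearity assumption the abstract refers to.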
Transcriptome-wide association studies (TWAS) have been widely used to integrate gene expression and genetic data for studying complex traits. Due to the computational burden, existing TWAS methods neglect distant trans-expression quantitative trait loci (eQTL) that are known to explain a significant proportion of the variation for most expression quantitative traits. To leverage both cis- and trans-eQTL information for TWAS, we propose a novel TWAS approach based on a Bayesian variable selection regression model, which not only accounts for both cis- and trans-SNPs of the target gene but also enables efficient computation by using summary statistics of standard eQTL analyses and a scalable EM-MCMC algorithm. Simulation studies illustrate that our Bayesian approach achieved higher TWAS power compared to existing methods. In application studies, we identified gene ZC3H12B, whose GReX is associated with both Alzheimer's dementia (AD) (p-value=2.15) and a global measure of AD pathology (p-value=2.438), which is completely driven by trans-eQTL. We also identified gene KCTD12, whose GReX is associated with β-amyloid load (p-value=7.63), which is driven by both cis- and trans-eQTL. Particularly, four of the top driven trans-eQTL of ZC3H12B are located in gene APOC1 (<12KB away from the well-known risk gene APOE of AD) and are also known GWAS signals of AD and blood lipids. Free software for implementing our proposed Bayesian TWAS approach is available on GitHub.
Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, existing TWAS methods impute gene expression by creating a weighted sum that aggregates SNPs with their corresponding cis-eQTL effects on transcriptome estimated from reference datasets. Existing TWAS methods then apply a linear regression model to assess the association between imputed gene expression and test phenotype, thereby assuming the effect of a cis-eQTL SNP on test phenotype is a linear function of the eQTL's estimated effect on reference transcriptome. Thus, existing TWAS methods make a strong assumption that cis-eQTL effect sizes on reference transcriptome are reflective of their corresponding SNP effect sizes on test phenotype. To increase TWAS robustness to this assumption, we propose a Variance-Component TWAS procedure (VC-TWAS) that assumes the effects of cis-eQTL SNPs on phenotype are random (with variance proportional to corresponding cis-eQTL effects in reference dataset) rather than fixed. By doing so, we show VC-TWAS is more powerful than traditional TWAS when cis-eQTL SNP effects on test phenotype truly differ from their eQTL effects within reference dataset. We further applied VC-TWAS using cis-eQTL effect sizes estimated by a nonparametric Bayesian method to study Alzheimer's dementia (AD) related phenotypes and detected 13 genes significantly associated with AD, including 6 known GWAS risk loci. All significant loci are proximal to the major known risk locus APOE for AD. Further, we add this VC-TWAS function into our previously developed tool TIGAR for public use.
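The variance-component idea in this abstract can be sketched as a weighted sum of squared per-SNP score statistics, so that SNPs contribute regardless of the sign of their effect on the phenotype. The sketch below makes several assumptions that are not stated in the abstract: squared eQTL effects are used as the SNP weights, the phenotype is continuous with no covariates, and significance is assessed by permutation. The function names and the toy data are hypothetical, and this is not the VC-TWAS implementation in TIGAR.

```python
import numpy as np

def vc_statistic(genotypes, phenotype, snp_weights):
    """Weighted sum of squared per-SNP scores (a kernel-type statistic).
    Assumption: no covariates, so the null model is just the phenotype mean."""
    resid = phenotype - phenotype.mean()
    scores = genotypes.T @ resid          # one score per SNP
    return float(np.sum(snp_weights * scores ** 2))

def permutation_pvalue(genotypes, phenotype, snp_weights, n_perm=999, seed=0):
    """Permutation p-value, used here as a stand-in for an analytic null."""
    rng = np.random.default_rng(seed)
    observed = vc_statistic(genotypes, phenotype, snp_weights)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(phenotype)
        if vc_statistic(genotypes, permuted, snp_weights) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Toy data: phenotype effects share magnitude with the eQTL weights but not sign,
# so the linearity assumption of the two-stage test is violated.
rng = np.random.default_rng(1)
n = 400
eqtl = np.array([0.5, -0.4, 0.3, 0.6, -0.2, 0.4, -0.5, 0.3])   # hypothetical weights
signs = np.array([1, -1, 1, -1, 1, -1, 1, -1])
G = rng.binomial(2, 0.3, size=(n, eqtl.size)).astype(float)
y = G @ (np.abs(eqtl) * signs) + rng.normal(0, 1.0, size=n)
p_value = permutation_pvalue(G, y, eqtl ** 2)
print(f"permutation p-value = {p_value:.3f}")
```

Weighting squared scores rather than summing signed effects is what lets a test of this kind retain power when phenotype effects are not proportional to the reference eQTL effects.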