We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-1131-9) contains supplementary material, which is available to authorized users.
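The idea of correcting test statistics against an empirical null can be illustrated with a small sketch. It is not the Bayesian method the abstract proposes: it swaps in a much simpler robust estimate (median for the bias, scaled median absolute deviation for the inflation) and assumes most features are null. The function names `empirical_null` and `correct` are hypothetical.

```python
import random
import statistics

def empirical_null(z_scores):
    """Estimate the centre (bias) and spread (inflation) of the null.

    Assumption: most test statistics come from the null, so robust
    summaries of all z-scores approximate the null's mean and SD.
    """
    bias = statistics.median(z_scores)
    # For normal data, 1.4826 * MAD is a consistent estimate of the SD.
    mad = statistics.median([abs(z - bias) for z in z_scores])
    inflation = 1.4826 * mad
    return bias, inflation

def correct(z_scores):
    """Recentre and rescale z-scores using the estimated empirical null."""
    bias, inflation = empirical_null(z_scores)
    return [(z - bias) / inflation for z in z_scores]

if __name__ == "__main__":
    random.seed(0)
    # Simulated null statistics with bias 0.3 and inflation 1.4 (made-up values).
    z = [random.gauss(0.3, 1.4) for _ in range(10_000)]
    print(empirical_null(z))
```

After correction the statistics are centred at zero with unit robust spread, so standard normal p-values can be computed from them. A robust estimator is used here because a plain mean and standard deviation would be pulled upward by true associations.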
Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene–trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.
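To make the use of multiple SNPs as instruments concrete, here is a minimal sketch of a summary-statistics Mendelian randomization estimate. It is a deliberate simplification, not TWMR itself: it uses the standard inverse-variance weighted (IVW) estimator for a single exposure, and it assumes independent SNPs and no pleiotropy. The function name `ivw_estimate` and all input values are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted causal effect from per-SNP summary statistics.

    beta_exposure: SNP effects on the exposure (e.g. gene expression)
    beta_outcome:  SNP effects on the outcome trait
    se_outcome:    standard errors of the outcome effects
    Assumes independent SNPs (no linkage disequilibrium) and no pleiotropy.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error

if __name__ == "__main__":
    # Made-up summary statistics for three SNPs.
    bx = [0.20, 0.35, 0.10]
    by = [0.10, 0.18, 0.04]
    se = [0.02, 0.03, 0.02]
    print(ivw_estimate(bx, by, se))
```

Extending this to several exposures at once, as the abstract describes, would replace the scalar ratio with a multivariable regression of outcome effects on the matrix of exposure effects.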
Leukocyte telomere length (LTL) is a heritable biomarker of genomic aging. In this study, we perform a genome-wide meta-analysis of LTL by pooling densely genotyped and imputed association results across large-scale European-descent studies including up to 78,592 individuals. We identify 49 genomic regions at a false discovery rate (FDR) < 0.05 threshold and prioritize genes at 31, with five highlighting nucleotide metabolism as an important regulator of LTL. We report six genome-wide significant loci in or near SENP7, MOB1B, CARMIL1, PRRC2A, TERF2, and RFWD3, and our results support recently identified PARP1, POT1, ATM, and MPHOSPH6 loci. Phenome-wide analyses in >350,000 UK Biobank participants suggest that genetically shorter telomere length increases the risk of hypothyroidism and decreases the risk of thyroid cancer, lymphoma, and a range of proliferative conditions. Our results replicate previously reported associations with increased risk of coronary artery disease and lower risk for multiple cancer types. Our findings substantially expand current knowledge on genes that regulate LTL and their impact on human health and disease.
We tested whether DNA-methylation profiles account for inter-individual variation in body mass index (BMI) and height and whether they predict these phenotypes over and above genetic factors. Genetic predictors were derived from published summary results from the largest genome-wide association studies on BMI (n ∼ 350,000) and height (n ∼ 250,000) to date. We derived methylation predictors by estimating probe-trait effects in discovery samples and tested them in external samples. Methylation profiles associated with BMI in older individuals from the Lothian Birth Cohorts (LBCs, n = 1,366) explained 4.9% of the variation in BMI in Dutch adults from the LifeLines DEEP study (n = 750) but did not account for any BMI variation in adolescents from the Brisbane Systems Genetic Study (BSGS, n = 403). Methylation profiles based on the Dutch sample explained 4.9% and 3.6% of the variation in BMI in the LBCs and BSGS, respectively. Methylation profiles predicted BMI independently of genetic profiles in an additive manner: 7%, 8%, and 14% of variance of BMI in the LBCs were explained by the methylation predictor, the genetic predictor, and a model containing both, respectively. The corresponding percentages for LifeLines DEEP were 5%, 9%, and 13%, respectively, suggesting that the methylation profiles represent environmental effects. The differential effects of the BMI methylation profiles by age support previous observations of age modulation of genetic contributions. In contrast, methylation profiles accounted for almost no variation in height, consistent with a mainly genetic contribution to inter-individual variation. The BMI results suggest that combining genetic and epigenetic information might have greater utility for complex-trait prediction.
Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that multiple causal genes exist for some of the AID loci. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways, such as autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, which suggests a model in which AID genes are regulated by a network of chromatin looping and non-coding RNA interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases.