Background: Part of the missing heritability in Genome-Wide Association Studies (GWAS) is expected to be explained by interactions between genetic variants, also called epistasis. Various statistical methods have been developed to detect epistasis in case-control GWAS. These methods face major statistical challenges due to the number of tests required, the complexity of the Linkage Disequilibrium (LD) structure, and the lack of consensus regarding the definition of epistasis. Their limited impact in terms of uncovering new biological knowledge might be explained in part by the limited amount of experimental data available to validate their statistical performance in a realistic GWAS context. In this paper, we introduce a simulation pipeline for generating real-scale GWAS data, including epistasis and realistic LD structure. We evaluate five exhaustive bivariate interaction methods: fastepi, GBOOST, SHEsisEpi, DSS, and IndOR. Two hundred thirty-four different disease scenarios are considered in extensive simulations. We report the performance of each method in terms of false positive rate control, power, area under the ROC curve (AUC), and computation time using a GPU. Finally, we compare the results of each method on a real GWAS of type 2 diabetes from the Wellcome Trust Case Control Consortium (WTCCC).

Results: GBOOST, SHEsisEpi, and DSS allow satisfactory control of the false positive rate. fastepi and IndOR show an increased false positive rate in the presence of LD between causal SNPs, with our definition of epistasis. DSS performs best in terms of power and AUC in most scenarios with no or weak LD between causal SNPs. All methods can exhaustively analyze a GWAS with 6 × 10^5 SNPs and 15,000 samples in a couple of hours using a GPU.

Conclusion: This study confirms that computation time is no longer a limiting factor for performing an exhaustive search for epistasis in large GWAS. For this task, using DSS on SNP pairs with limited LD seems to be a good strategy to achieve the best statistical performance. A combination approach using both DSS and GBOOST is supported by the simulation results, and the analysis of the WTCCC dataset demonstrated that this approach can detect distinct genes in epistasis. Finally, weak epistasis between common variants will be detectable with existing methods when GWAS of a few tens of thousands of cases and controls are available.

Electronic supplementary material: The online version of this article (10.1186/s12859-018-2229-8) contains supplementary material, which is available to authorized users.
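To make the "number of tests required" concrete, the sketch below counts the SNP pairs in an exhaustive bivariate search over the 6 × 10^5 SNPs quoted in the abstract and derives a per-test significance threshold. The Bonferroni correction and the family-wise level of 0.05 are assumptions made for illustration only; the abstract does not say which multiple-testing correction the authors applied, and the function names are hypothetical.

```python
from math import comb


def n_snp_pairs(n_snps: int) -> int:
    """Number of unordered SNP pairs tested in an exhaustive pairwise search."""
    return comb(n_snps, 2)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold under a Bonferroni correction.

    Assumption: Bonferroni and alpha = 0.05 are illustrative choices,
    not taken from the paper.
    """
    return alpha / n_tests


if __name__ == "__main__":
    n_snps = 600_000  # 6 x 10^5 SNPs, as quoted in the abstract
    pairs = n_snp_pairs(n_snps)
    print(f"SNP pairs to test: {pairs:,}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(pairs):.3e}")
```

With roughly 1.8 × 10^11 pairs, the per-test threshold falls below 10^-12, which illustrates why detecting weak epistasis requires large sample sizes.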
Autosomal genetic analyses of blood lipids have yielded key insights for coronary heart disease (CHD). However, X chromosome genetic variation is understudied for blood lipids in large sample sizes. We now analyze genetic and blood lipid data in a high-coverage whole X chromosome sequencing study of 65,322 multi-ancestry participants and perform replication among 456,893 European participants. Common alleles on chromosome Xq23 are strongly associated with reduced total cholesterol, LDL cholesterol, and triglycerides (min P = 8.5 × 10^−72), with similar effects for males and females. Chromosome Xq23 lipid-lowering alleles are associated with reduced odds for CHD among 42,545 cases and 591,247 controls (P = 1.7 × 10^−4), and reduced odds for diabetes mellitus type 2 among 54,095 cases and 573,885 controls (P = 1.4 × 10^−5). Although we observe an association with increased BMI, waist-to-hip ratio adjusted for BMI is reduced, bioimpedance analyses indicate increased gluteofemoral fat, and abdominal MRI analyses indicate reduced visceral adiposity. Co-localization analyses strongly correlate increased CHRDL1 gene expression, particularly in adipose tissue, with reduced concentrations of blood lipids.