Meta-analysis has become a widely used tool for many applications in bioinformatics, including genome-wide association studies. A commonly used approach is the fixed effects model, for which there are two popular methods: the inverse variance-weighted average method and the weighted sum of z-scores method. Although previous studies have shown that the two methods perform similarly, their characteristics and their relationship have not been thoroughly investigated. In this paper, we investigate the optimal characteristics of the two methods and show the connection between them. We demonstrate that each method is optimized for a unique goal, which gives us insight into the optimal weights for the weighted sum of z-scores method. We examine the connection between the two methods both analytically and empirically and show that their resulting statistics become equivalent under certain assumptions. Finally, we apply both methods to the Wellcome Trust Case Control Consortium data and demonstrate that the two methods can give distinct results in certain study designs.
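As a rough illustration of the two fixed effects methods named above, the following Python sketch computes both statistics from per-study summary data. It is not taken from the paper: the function names, the input values, and the choice of weights are assumptions made for the example. One relationship the sketch relies on is purely algebraic: when the z-score weights are set to 1/SE for each study, the weighted sum of z-scores equals the z-score of the inverse variance-weighted estimate. Square-root-of-sample-size weights are another common choice and need not give the same value.

```python
import math

def inverse_variance_meta(betas, ses):
    """Inverse variance-weighted average of per-study effect estimates.

    betas: per-study effect sizes (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z).
    """
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / SE_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

def weighted_z_meta(zs, weights):
    """Weighted sum of z-scores: sum(w_i * z_i) / sqrt(sum(w_i^2))."""
    numerator = sum(w * z for w, z in zip(weights, zs))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical summary statistics for three studies (illustrative only).
betas = [0.2, 0.1, 0.3]
ses = [0.1, 0.05, 0.2]

pooled_beta, pooled_se, z_ivw = inverse_variance_meta(betas, ses)

# Per-study z-scores, combined with weights 1/SE (an assumption of this sketch).
zs = [b / se for b, se in zip(betas, ses)]
z_wsz = weighted_z_meta(zs, [1.0 / se for se in ses])

print(pooled_beta, pooled_se, z_ivw, z_wsz, two_sided_p(z_ivw))
```

With weights other than 1/SE, for example the square root of each study's sample size, `weighted_z_meta` can diverge from the inverse variance-weighted z-score, which is the kind of difference the abstract refers to for certain study designs.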
Summary
Introduction: The discovery of disease-associated loci through genome-wide association studies (GWAS) is the leading approach to the identification of novel biological pathways for human disease. To date, GWAS have been limited by relatively small sample sizes and have yielded relatively few loci associated with ischemic stroke. The Stroke Genetics Network of the National Institute of Neurological Disorders and Stroke (NINDS-SiGN) is an international consortium that has taken a systematic approach to phenotyping and produced the largest ischemic stroke GWAS to date.
Methods: To identify genetic loci associated with ischemic stroke, we performed a two-stage genome-wide association study. The first stage consisted of 16,851 cases with state-of-the-art phenotyping and 32,473 stroke-free controls. Cases were aged 16 to 104 years, recruited between 1989 and 2012, and subtyped by centrally trained and certified investigators using the web-based protocol, Causative Classification of Stroke (CCS). We constructed case-control strata by identifying samples genotyped on (nearly) identical arrays and of similar genetic ancestral background. Data were cleaned and imputed using dense imputation reference panels generated from whole-genome sequence data. Genome-wide testing was performed within each stratum for each available phenotype, and summary-level results were combined using inverse variance-weighted fixed effects meta-analysis. The second stage consisted of in silico look-ups of 1,372 SNPs in 20,941 cases and 364,736 stroke-free controls, with cases previously subtyped using the TOAST classification system according to local standards. The two stages were then jointly analyzed in a final meta-analysis.
Findings: We identified a novel locus at 1p13.2 near TSPAN2 associated with large artery atherosclerosis (LAA)-related stroke (stage I OR for the G allele at rs12122341 = 1.21, p = 4.50 × 10^−8; stage II OR = 1.19, p = 1.30 × 10^−9). We also confirmed four loci robustly associated with ischemic stroke and reported in prior studies, including PITX2 and ZFHX3 for cardioembolic stroke, and HDAC9 for LAA stroke. The 12q24 locus near ALDH2, originally associated with all ischemic stroke but not with any specific subtype, exceeded genome-wide significance in the meta-analysis of small artery stroke. Other loci, including NINJ2, were not confirmed.
Interpretation: Our results identify a novel LAA-stroke susceptibility gene and now indicate that all loci implicated by GWAS to date are subtype specific. Follow-up studies will be necessary to determine whether the locus near TSPAN2 yields a novel therapeutic approach to stroke prevention. Given the subtype specificity of these associations, the rich phenotyping available in SiGN is likely to prove vital for further genetic discovery in ischemic stroke.
Funding: National Institute of Neurological Disorders and Stroke (NINDS), National Institutes of Health (NIH).
There is growing evidence of shared risk alleles between complex traits (pleiotropy), including autoimmune and neuropsychiatric diseases. This might be due to sharing between all individuals (whole-group pleiotropy), or a subset of individuals within a genetically heterogeneous cohort (subgroup heterogeneity). BUHMBOX is a well-powered statistic that distinguishes between these two situations using genotype data. We observed a shared genetic basis between 11 autoimmune diseases and type 1 diabetes (T1D, p < 10^−4), and between 11 autoimmune diseases and rheumatoid arthritis (RA, p < 10^−3). This sharing was not explained by subgroup heterogeneity (corrected p_BUHMBOX > 0.2, 6,670 T1D cases and 7,279 RA cases). Genetic sharing between seronegative and seropositive RA (p < 10^−9) had significant evidence of subgroup heterogeneity, suggesting a subgroup of seropositive-like cases within seronegative cases (p_BUHMBOX = 0.008, 2,406 seronegative RA cases). We also observed a shared genetic basis between major depressive disorder (MDD) and schizophrenia (p < 10^−4) that was not explained by subgroup heterogeneity (p_BUHMBOX = 0.28 in 9,238 MDD cases).