Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
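The principal-components idea described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names (`top_ancestry_axes`, `residualize`, `adjusted_statistic`), the per-marker normalization by sqrt(p(1 − p)), and the (N − K − 1) × r² test statistic are assumptions chosen for illustration, and genotypes are assumed to be coded as 0/1/2 allele counts with no missing data.

```python
import numpy as np


def top_ancestry_axes(genotypes, k):
    """Return the top-k principal axes across samples.

    genotypes: (n_markers, n_samples) array of 0/1/2 allele counts.
    Result:    (n_samples, k) array, one column per inferred axis.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=1, keepdims=True) / 2.0      # per-marker allele frequency
    keep = ((p > 0) & (p < 1)).ravel()           # drop monomorphic markers
    # Center each marker and scale by an assumed sqrt(p(1-p)) normalization.
    z = (g[keep] - 2.0 * p[keep]) / np.sqrt(p[keep] * (1.0 - p[keep]))
    # Right singular vectors of z are eigenvectors of the
    # sample-by-sample covariance matrix.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return vt[:k].T


def residualize(values, axes):
    """Center a per-sample vector and remove its projection onto the axes."""
    v = np.asarray(values, dtype=float)
    v = v - v.mean()
    coef, *_ = np.linalg.lstsq(axes, v, rcond=None)
    return v - axes @ coef


def adjusted_statistic(marker, phenotype, axes):
    """Association statistic after adjusting marker and phenotype for ancestry.

    Uses (N - K - 1) * r^2, where r is the correlation between the
    ancestry-adjusted genotype and the ancestry-adjusted phenotype.
    """
    n, k = axes.shape
    g = residualize(marker, axes)
    y = residualize(phenotype, axes)
    r2 = (g @ y) ** 2 / ((g @ g) * (y @ y))
    return (n - k - 1) * r2
```

Because each marker is adjusted by its own regression on the ancestry axes, a marker whose frequency varies strongly across ancestries is corrected more than one whose frequency does not, which mirrors the marker-specific correction the abstract describes.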
To identify novel genetic risk factors for rheumatoid arthritis (RA), we conducted a genome-wide association study (GWAS) meta-analysis of 5,539 autoantibody-positive RA cases and 20,169 controls of European descent, followed by replication in an independent set of 6,768 RA cases and 8,806 controls. Of 34 SNPs selected for replication, 7 novel RA risk alleles were identified at genome-wide significance (P < 5 × 10^−8) in analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5, and PXK. We also refined the risk alleles at two established RA risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed RA risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P < 0.05, many of which are validated autoimmune risk alleles, suggesting that most represent bona fide RA risk alleles.
To identify rheumatoid arthritis risk loci in European populations, we conducted a meta-analysis of two published genome-wide association (GWA) studies totaling 3,393 cases and 12,462 controls (refs. 1,2). We genotyped 31 top-ranked SNPs not previously associated with rheumatoid arthritis in an independent replication of 3,929 autoantibody-positive rheumatoid arthritis cases and 5,807 matched controls from eight separate collections. We identified a common variant at the CD40 gene locus (rs4810485, P = 0.0032 replication, P = 8.2 × 10^−9 overall, OR = 0.87). Along with other associations near TRAF1 (refs. 2,3) and TNFAIP3 (refs. 4,5), this implies a central role for the CD40 signaling pathway in rheumatoid arthritis pathogenesis. We also identified association at the CCL21 gene locus (rs2812378, P = 0.00097 replication, P = 2.8 × 10^−7 overall), a gene involved in lymphocyte trafficking. Finally, we identified evidence of association at four additional gene loci: MMEL1-TNFRSF14 (rs3890745, P = 0.0035 replication, P = 1.1 × 10^−7 overall), CDK6 (rs42041, P = 0.010 replication, P = 4.0 × 10^−6 overall), PRKCQ (rs4750316, P = 0.0078 replication, P = 4.4 × 10^−6 overall), and KIF5A-PIP4K2C (rs1678542, P = 0.0026 replication, P = 8.8 × 10^−8 overall).
To identify susceptibility alleles associated with rheumatoid arthritis, we genotyped 397 individuals with rheumatoid arthritis for 116,204 SNPs and carried out an association analysis in comparison to publicly available genotype data for 1,211 related individuals from the Framingham Heart Study (ref. 1). After evaluating and adjusting for technical and population biases, we identified a SNP at 6q23 (rs10499194, ∼150 kb from TNFAIP3 and OLIG3) that was reproducibly associated with rheumatoid arthritis both in the genome-wide association (GWA) scan and in 5,541 additional case-control samples (P = 10^−3, GWA scan; P < 10^−6, replication; P = 10^−9, combined). In a concurrent study, the Wellcome Trust Case Control Consortium (WTCCC) has reported strong association of rheumatoid arthritis susceptibility to a different SNP located 3.8 kb from rs10499194 (rs6920220; P = 5 × 10^−6 in WTCCC) (ref. 2). We show that these two SNP associations are statistically independent, are each reproducible in the comparison of our data and WTCCC data, and define risk and protective haplotypes for rheumatoid arthritis at 6q23.
Rheumatoid arthritis is the most common inflammatory arthritis, affecting up to 1% of the adult population (ref. 3). Two loci, HLA-DRB1 (ref. 4) and PTPN22 (ref. 5), have previously been associated with rheumatoid arthritis susceptibility in individuals with circulating antibodies to cyclic citrullinated peptides (CCP). Most of the inheritance of rheumatoid arthritis remains unexplained.
To identify additional common variants associated with risk of CCP antibody-associated (CCP+) rheumatoid arthritis, we conducted a GWA study using the Affymetrix 100K GeneChip microarray in a longitudinal case series of individuals with CCP+ rheumatoid arthritis (the Brigham Rheumatoid Arthritis Sequential Study (BRASS) cohort).
As we lacked epidemiologically matched controls, we compared case data to publicly available genotype data collected using the same platform from 1,211 related Framingham Heart Study (FHS) participants (ref. 1), drawn from the same geographical region as the individuals in our study (near Boston, Massachusetts, USA).
Before comparing allele frequencies between cases and controls, we considered biases that may be introduced by the use of shared controls. Such biases, whether due to nonrandom distribution of technical artifacts (ref. 6) or to population differences between case and control data (refs. 7,8), would result in a non-null distribution of test statistics with excess false-positive associations. In an initial analysis of unrelated case-control samples, we assessed the median distribution of test statistics with the genomic-control parameter λGC (ref. 9) (where 1.0 indicates no inflation) and examined the tail of the distribution of association statistics in a comparison of observed and expected P values (Q-Q plot; Fig. 1).
Using published data quality control parameters from early studies on this genotyping platform (genotype call rates > 90%, minor allele frequency (MAF) > 5%) (ref. 1), we observed λGC = 1.19 and an excess of associations in the e...
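The two diagnostics mentioned in this excerpt, the genomic-control parameter and the observed-versus-expected P value comparison, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's pipeline: the helper names are hypothetical, the test statistics are assumed to be 1-degree-of-freedom chi-square values, and the (rank − 0.5)/n plotting positions are one common convention for Q-Q plots.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.454936


def genomic_control_lambda(chi2_stats):
    """Ratio of the observed median statistic to the null median.

    A value near 1.0 indicates no inflation of the test statistics.
    """
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def qq_points(p_values):
    """Pair expected and observed -log10(P) values for a Q-Q plot.

    Expected quantiles use the assumed (rank - 0.5) / n convention.
    """
    observed = sorted(p_values)
    n = len(observed)
    return [
        (-math.log10((i + 0.5) / n), -math.log10(p))
        for i, p in enumerate(observed)
    ]
```

Points that rise above the diagonal across the whole range of a Q-Q plot, together with a value of λGC well above 1.0, suggest systematic bias rather than a handful of true associations.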