Polygenic risk scores (PRS) use the results of genome-wide association studies (GWAS) to predict quantitative phenotypes or disease risk at an individual level. This provides a potential route to the use of genetic data in personalized medical care. However, a major barrier to the use of PRS is that the majority of GWAS come from cohorts of European ancestry. The predictive power of PRS constructed from these studies is substantially lower in non-European ancestry cohorts, although the reasons for this are unclear. To address this question, we investigate the performance of PRS for height in cohorts with admixed African and European ancestry, allowing us to evaluate ancestry-related differences in PRS predictive accuracy while controlling for environment and cohort differences. We first show that the predictive accuracy of height PRS increases linearly with European ancestry and is largely explained by European ancestry segments of the admixed genomes. We show that differences in allele frequencies, recombination rate, and marginal effect sizes across ancestries all contribute to the decrease in predictive power, but none of these effects explains the decrease on its own. Finally, we demonstrate that prediction for admixed individuals can be improved by using a linear combination of PRS that includes ancestry-specific effect sizes, although this approach is at present limited by the small size of non-European ancestry discovery cohorts.
we lifted over SNP positions to hg19 using liftOver. For WHI, JHS and HRS, we flipped alleles to the positive strand using the appropriate strand files from https://www.well.ox.ac.uk/~wrayner/strand/. We identified individuals with admixed African ancestry in each cohort using a combination of genetic clustering and self-reported ancestry as follows:

UKB: This dataset contains several ancestry groups. We selected 8,813 individuals with African or admixed African and European ancestry based on PCA (Figure S1) and refer to them as "UKB_afr". We further filtered this set to contain individuals with at least 5% of African ancestry, resulting in 8,700 individuals (Table 1). We randomly selected 9,998 European ancestry individuals from the "White British" subset to use as a comparison sample and refer to them as "UKB_eur".
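The two UKB selection steps (a minimum African ancestry threshold and a random draw of comparison individuals) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the record layout, the field names `id` and `afr`, the helper names, and the fixed seed are all assumptions made for illustration, and the toy values are not taken from any cohort.

```python
import random

MIN_AFR_FRACTION = 0.05  # "at least 5% of African ancestry" (threshold from the text)


def filter_admixed(individuals, min_afr=MIN_AFR_FRACTION):
    """Keep individuals whose estimated African ancestry fraction is >= min_afr.

    `individuals` is a list of dicts with hypothetical keys "id" and "afr".
    """
    return [ind for ind in individuals if ind["afr"] >= min_afr]


def sample_comparison(ids, n, seed=0):
    """Draw n distinct IDs uniformly at random (seeded so the draw is repeatable)."""
    return random.Random(seed).sample(ids, n)


# Toy data for illustration only.
candidates = [
    {"id": "s1", "afr": 0.82},
    {"id": "s2", "afr": 0.03},  # below the threshold, dropped
    {"id": "s3", "afr": 0.05},  # exactly at the threshold, kept
    {"id": "s4", "afr": 0.47},
]
admixed = filter_admixed(candidates)
admixed_ids = [ind["id"] for ind in admixed]

european_ids = [f"e{i}" for i in range(20)]
comparison = sample_comparison(european_ids, n=5)
```

Seeding the random draw is a design choice of this sketch; it makes the comparison sample reproducible across runs, which the text does not specify.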
WHI: This dataset contains both African American and Hispanic participants. We ran unsupervised ADMIXTURE [27] with k=3 and identified 7,285 individuals with self-reported "African American" ancestry with at most 0.8 of the first ADMIXTURE component (which we interpret as reflecting European ancestry), and at most 0.05 of the second (which we interpret as reflecting Native American ancestry; Figure S2). We further filtered this set to contain individuals with at least 5% of African ancestry and height within ±2 standard deviations (sd) of the mean (see Figure S12), resulting in 6,863 individuals (Table 1). We refer to them as "WHI_afr".

HRS: This dataset contains multip...