Background: The relationship between germline genetic variation and breast cancer survival is largely unknown, especially in understudied minority populations who often have poorer survival. Genome-wide association studies (GWAS) have interrogated breast cancer survival but are often underpowered due to subtype heterogeneity and many clinical covariates, and they detect loci in non-coding regions that are difficult to interpret. Transcriptome-wide association studies (TWAS) show increased power in detecting functionally relevant loci by leveraging expression quantitative trait loci (eQTLs) from external reference panels in relevant tissues. However, ancestry- or race-specific reference panels may be needed to draw correct inference in ancestrally diverse cohorts. Such panels for breast cancer are lacking.
Results: We provide a framework for TWAS for breast cancer in diverse populations, using data from the Carolina Breast Cancer Study (CBCS), a North Carolina population-based cohort that oversampled black women. We perform eQTL analysis for 406 breast cancer-related genes to train race-stratified predictive models of tumor expression from germline genotypes. Using these models, we impute expression in independent data from CBCS and TCGA, accounting for sampling variability in assessing performance. These models are not applicable across race, and their predictive performance varies across tumor subtype. Within CBCS (N = 3,828), at a false discovery-adjusted significance of 0.10 and stratifying for race, we identify associations in black women near AURKA, CAPN13, PIK3CA, and SERPINB5 via TWAS that are underpowered in GWAS.
Conclusions: We show that carefully implemented and thoroughly validated TWAS is an efficient approach for understanding the genetics underpinning breast cancer outcomes in diverse populations.

Keywords: transcriptome-wide analysis (TWAS); breast cancer; expression quantitative trait loci (eQTL); survival; polygenic traits

Background

Breast cancer remains the most common cancer among women in the world [1]. Breast cancer tends to be more aggressive in young women and African American women, though underlying germline determinants of poor outcomes are not well-studied. Cohorts that represent understudied minority populations, like the Carolina Breast Cancer Study (CBCS), have identified differences in healthcare access, socioeconomics, and environmental exposures associated with disparities in outcome [2-4], but more targeted genomic studies are necessary to interrogate these disparities from a biologic and genetic perspective.

Few genome-wide association studies (GWAS) have studied the relationship between germline variation and survival outcomes in breast cancer, with most focusing instead on genetic predictors of risk [5,6]. Recently, GWAS have shown evidence of association between candidate common germline variants and breast cancer survival, but these studies are often underpow...