Background: Genome-wide association studies (GWAS) are greatly accelerating the discovery of germline variants underlying the genetic architecture of sporadic breast cancer predisposition. We have built the first knowledge-base dedicated to this field and used it to generate hypotheses on the molecular pathways involved in disease susceptibility.
Methods: We gathered data on the common single nucleotide polymorphisms (SNPs) discovered by breast cancer risk GWAS. Information on SNP functional effect (including data on linkage disequilibrium, expression quantitative trait loci, and SNP relationship with regulatory motifs or promoter/enhancer histone marks) was used to select putative breast cancer predisposition genes (BCPGs). Finally, the BCPGs were subjected to pathway (gene set enrichment) analysis and network (protein-protein interaction) analysis.
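As an illustration of the pathway step, the sketch below shows one common form of gene set enrichment, an over-representation test based on the hypergeometric distribution. The abstract does not state which enrichment method or software was used, so this is only an assumed stand-in; the function name and the toy gene counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, selected_size, overlap):
    """One-sided over-representation P-value.

    Probability of seeing at least `overlap` pathway genes when
    `selected_size` genes are drawn without replacement from a universe of
    `universe_size` genes, of which `pathway_size` belong to the pathway.
    """
    upper = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, selected_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 genes in the universe, 5 in the pathway,
# 4 selected as candidate genes, 3 of which fall in the pathway.
p_toy = hypergeom_enrichment_p(20, 5, 4, 3)
```

In practice the test would be repeated for every pathway in a database and the resulting P-values corrected for multiple testing; that correction is omitted here for brevity.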
Results: Data from 38 studies (28 original case-control GWAS enrolling 383,260 patients with breast cancer, and 10 GWAS meta-analyses) were retrieved. Overall, 281 SNPs were associated with the risk of breast cancer with a P-value <10E-06 and a minor allele frequency >1%. Based on functional information, we identified 296 putative BCPGs. Primary analysis showed that germline perturbation of classical cancer-related pathways (e.g., apoptosis, cell cycle, signal transduction including estrogen receptor signaling) plays a significant role in breast carcinogenesis. Other less established pathways (such as ribosome and peroxisome machineries) were also highlighted. In the main subgroup analysis, we considered the BCPGs encoding transcription factors (n=36), which in turn target 252 genes. Interestingly, pathway and network analysis of these genes yielded results resembling those of the primary analysis, suggesting that most of the effect of genetic variation on disease risk hinges upon transcriptional regulons.
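The SNP inclusion criteria quoted above (a P-value cutoff plus a minor allele frequency above 1%) can be sketched as a simple filter. This is a minimal sketch under stated assumptions: the P-value cutoff is read here as 1e-6, the record layout is invented for illustration, and the three toy SNP records are placeholders rather than data from the knowledge-base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpAssociation:
    """Hypothetical record layout for one SNP-disease association."""
    rsid: str
    p_value: float
    minor_allele_freq: float


P_VALUE_CUTOFF = 1e-6  # assumption: the reported cutoff read as 1 x 10^-6
MAF_CUTOFF = 0.01      # minor allele frequency > 1%


def passes_thresholds(snp):
    """True when the SNP meets both the P-value and the MAF criterion."""
    return snp.p_value < P_VALUE_CUTOFF and snp.minor_allele_freq > MAF_CUTOFF


def select_snps(records):
    """Keep only the associations that satisfy both thresholds."""
    return [snp for snp in records if passes_thresholds(snp)]


# Placeholder records, not real study data.
toy_records = [
    SnpAssociation("rs_example_1", 3e-9, 0.25),   # passes both criteria
    SnpAssociation("rs_example_2", 4e-5, 0.30),   # P-value too large
    SnpAssociation("rs_example_3", 1e-8, 0.004),  # allele too rare
]
selected = select_snps(toy_records)
```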
Conclusions: This knowledge-base, which is freely available and will be annually updated, can inform future studies dedicated to breast cancer molecular epidemiology as well as genetic susceptibility and development.