The emergence of large-scale multi-omics data warrants method development for data integration.
Genomic studies of cancer patients have identified epigenetic and genetic regulators, such as methylation marks, somatic mutations, and somatic copy number alterations (SCNAs), among others, as predictive features of cancer outcome. However, identification of "driver genes" associated with a given alteration remains a challenge. To this end, we developed a computational tool, iEDGE, to model cis and trans effects of (epi-)DNA alterations and identify potential cis driver genes, where cis and trans genes denote those genes falling within and outside the genomic boundaries of a given (epi-)genetic alteration, respectively.
First, iEDGE identifies the cis and trans genes associated with the presence/absence of a particular epi-DNA alteration across samples. Tests of statistical mediation are then performed to determine the cis genes predictive of the trans gene expression. Finally, cis and trans effects are annotated by pathway enrichment analysis to gain insights into the underlying regulatory networks.
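The mediation step described above can be illustrated with a minimal sketch. It assumes a Sobel test, which is one common test of statistical mediation; this passage does not state which test iEDGE uses, so the choice is an assumption. The function names (`invert`, `ols`, `sobel_test`) and the simulated data are hypothetical and are not part of iEDGE. For one alteration, one cis gene, and one trans gene, the sketch estimates path a (alteration to cis expression) and path b (cis expression to trans expression, adjusted for the alteration), then tests whether the indirect effect a·b differs from zero.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(columns, y):
    """Least-squares fit of y on the given predictor columns plus an intercept.

    Returns (coefficients, standard_errors); index 0 is the intercept.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
           for r in range(p)]
    xty = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[r][c] * xty[c] for c in range(p)) for r in range(p)]
    resid = [y[i] - sum(X[i][c] * beta[c] for c in range(p)) for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - p)
    se = [math.sqrt(sigma2 * inv[r][r]) for r in range(p)]
    return beta, se


def sobel_test(alteration, cis_expr, trans_expr):
    """Sobel test of whether cis_expr mediates the alteration's effect on trans_expr.

    Returns (z, two_sided_p).
    """
    # Path a: effect of the alteration on cis gene expression.
    beta_a, se_a = ols([alteration], cis_expr)
    a, sa = beta_a[1], se_a[1]
    # Path b: effect of cis expression on trans expression, adjusting for the alteration.
    beta_b, se_b = ols([alteration, cis_expr], trans_expr)
    b, sb = beta_b[2], se_b[2]
    z = (a * b) / math.sqrt(b * b * sa * sa + a * a * sb * sb)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p


# Simulated example (hypothetical data): the alteration raises cis expression,
# and the trans gene depends on the alteration only through the cis gene.
random.seed(0)
alteration = [i % 2 for i in range(200)]  # 1 = alteration present, 0 = absent
cis = [2.0 * x + random.gauss(0, 1) for x in alteration]
trans = [1.5 * c + random.gauss(0, 1) for c in cis]
z, p = sobel_test(alteration, cis, trans)
print(f"Sobel z = {z:.2f}, p = {p:.3g}")
```

In a full analysis, a test of this kind would be repeated for each cis/trans gene pair within an alteration, with multiple-testing correction applied across pairs; those steps are omitted here for brevity.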
We used iEDGE to perform integrative analysis of SCNAs and gene expression data from breast cancer and 18 additional cancer types included in The Cancer Genome Atlas (TCGA). Notably, cis gene drivers identified by iEDGE were found to be significantly enriched for known driver genes from multiple compendia of validated oncogenes and tumor suppressors, suggesting that the remainder may be of similar importance. Furthermore, predicted drivers were enriched for functionally relevant cancer genes with amplification-driven dependencies, which are of potential prognostic and therapeutic value. All analysis results are accessible at https://montilab.bu.edu/iEDGE.
[...] development. Furthermore, the generated analysis results are often static and not accessible in an interactive fashion. The approach presented here aims to address both of these shortcomings.
The central hypothesis behind integrative approaches is that the integration of multi-level genomics data allows for prioritization of putative cancer "drivers" that are potential biomarkers or therapeutic targets.
Somatic copy-number alterations (SCNAs) are an important type of genetic alteration in cancer. SCNAs harbor many known cancer drivers (oncogenes or tumor suppressors) and play an important role in cancer initiation and/or progression through activation of oncogenes and inactivation of tumor suppressors (Beroukhim et al. 2010; Zack et al. 2013). Identification of unknown SCNA-associated cancer drivers is complicated by the fact that each SCNA contains many genes, often even a complete chromosome arm, the majority of which are unlikely to confer any selective advantage (i.e., passengers). One approach to [...] SCNAs. For instance, the gene sets HALLMARK_ESTROGEN_RESPONSE_LATE, HALLMARK_ESTROGEN_RESPONSE_EARLY, and HALLMARK_G2M_CHECKPOINT were significant hits in more than 75% of SCNAs. These gene s...