Background
Breast cancer is a highly heterogeneous disease that comprises multiple biological components. Owing its diversity, patients have different prognostic outcomes; hence, early diagnosis and accurate subtype prediction are critical for treatment. Standardized breast cancer subtyping systems, mainly based on single-omics datasets, have been developed to ensure proper treatment in a systematic manner. Recently, multi-omics data integration has attracted attention to provide a comprehensive view of patients but poses a challenge due to the high dimensionality. In recent years, deep learning-based approaches have been proposed, but they still present several limitations.
Results
In this study, we describe moBRCA-net, an interpretable deep learning-based breast cancer subtype classification framework that uses multi-omics datasets. Three omics datasets comprising gene expression, DNA methylation and microRNA expression data were integrated while considering the biological relationships among them, and a self-attention module was applied to each omics dataset to capture the relative importance of each feature. The features were then transformed to new representations considering the respective learned importance, allowing moBRCA-net to predict the subtype.
Conclusions
Experimental results confirmed that moBRCA-net has a significantly enhanced performance compared with other methods, and the effectiveness of multi-omics integration and omics-level attention were identified. moBRCA-net is publicly available at https://github.com/cbi-bioinfo/moBRCA-net.