Background: Understanding cellular and molecular heterogeneity in glioblastoma (GBM), the most common and aggressive primary brain malignancy, is a crucial step towards the development of effective therapies. Besides the inter-patient variability, the presence of multiple cell populations within tumors calls for the need to develop modeling strategies able to extract the molecular signatures driving tumor evolution and treatment failure. With the advances in single-cell RNA Sequencing (scRNA-Seq), tumors can now be dissected at the cell level, unveiling information from their life history to their clinical implications. Results: We propose a classification setting based on GBM scRNA-Seq data, through sparse logistic regression, where different cell populations (neoplastic and normal cells) are taken as classes. The goal is to identify gene features discriminating between the classes, but also those shared by different neoplastic clones. The latter will be approached via the network-based twiner regularizer to identify gene signatures shared by neoplastic cells from the tumor core and infiltrating neoplastic cells originated from the tumor periphery, as putative disease biomarkers to target multiple neoplastic clones. Our analysis is supported by the literature through the identification of several known molecular players in GBM. Moreover, the relevance of the selected genes was confirmed by their significance in the survival outcomes in bulk GBM RNA-Seq data, as well as their association with several Gene Ontology (GO) biological process terms.
Conclusions:We presented a methodology intended to identify genes discriminating between GBM clones, but also those playing a similar role in different GBM neoplastic clones (including migrating cells), therefore potential targets for therapy research. Our results contribute to a deeper understanding on the genetic features behind GBM, by disclosing novel therapeutic directions accounting for GBM heterogeneity.