Background
Recent studies have indicated that a special class of long non-coding RNAs (lncRNAs), namely Transcribed-Ultraconservative Regions (T-UCRs), is transcribed from specific DNA regions that are 100% conserved in the human, mouse, and rat genomes. This is notable, as lncRNAs are usually poorly conserved. Despite their peculiarities, T-UCRs remain understudied in many diseases, including cancer, even though dysregulation of T-UCRs is known to be associated with cancer as well as with human neurological, cardiovascular, and developmental pathologies. We have recently reported the T-UCR uc.8+ as a potential prognostic biomarker in bladder cancer.
Results
The aim of this work is to develop a methodology, based on machine learning techniques, for the selection of a predictive signature panel for bladder cancer onset. To this end, we analyzed the expression profiles of T-UCRs from surgically removed normal and bladder cancer tissues using a custom expression microarray. Bladder tissue samples from 24 bladder cancer patients (12 Low Grade and 12 High Grade), with complete clinical data, and 17 control samples from normal bladder epithelium were analyzed. After selecting the preferentially expressed and statistically significant T-UCRs, we adopted an ensemble of statistical and machine-learning-based approaches (i.e., logistic regression, Random Forest, XGBoost and LASSO) to rank the most important diagnostic molecules. We identified a signature panel of 13 T-UCRs with altered expression profiles in cancer, able to efficiently discriminate between normal and bladder cancer patient samples. Using this signature panel, we also classified bladder cancer patients into four groups, each characterized by a different survival outcome. As expected, the group including only Low Grade bladder cancer patients had greater overall survival than the groups consisting mostly of High Grade patients. However, a specific signature of deregulated T-UCRs identifies sub-types of bladder cancer patients with different prognoses regardless of the bladder cancer Grade.
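As an illustration of how rankings from several models might be combined into one panel, the following minimal sketch averages per-model feature ranks and keeps the top-ranked features. The abstract does not state how the four rankings were merged, so the mean-rank rule is an assumption, and the feature names, scores, and helper functions (`ranks_from_scores`, `aggregate_rankings`, `select_panel`) are hypothetical and do not come from the study.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Rank features by descending absolute score (1 = most important).

    Ties are broken alphabetically by feature name so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda name: (-abs(scores[name]), name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def aggregate_rankings(per_model_scores):
    """Order features by their mean rank across models (lower is better).

    Assumes every model scored exactly the same set of features.
    """
    per_model_ranks = [ranks_from_scores(s) for s in per_model_scores.values()]
    features = sorted(per_model_ranks[0])
    mean_rank = {f: mean(r[f] for r in per_model_ranks) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))


def select_panel(per_model_scores, panel_size):
    """Return the panel_size features with the best mean rank."""
    return aggregate_rankings(per_model_scores)[:panel_size]


# Made-up importance scores for four hypothetical features; in practice these
# would be coefficients or importances taken from the fitted models.
toy_scores = {
    "logistic_regression": {"tucr_a": 1.2, "tucr_b": -0.4, "tucr_c": 0.1, "tucr_d": -2.0},
    "random_forest": {"tucr_a": 0.30, "tucr_b": 0.10, "tucr_c": 0.05, "tucr_d": 0.55},
    "xgboost": {"tucr_a": 0.40, "tucr_b": 0.15, "tucr_c": 0.05, "tucr_d": 0.40},
    "lasso": {"tucr_a": 0.8, "tucr_b": 0.0, "tucr_c": 0.0, "tucr_d": -0.9},
}

panel = select_panel(toy_scores, panel_size=2)
print(panel)  # ['tucr_d', 'tucr_a']
```

Averaging ranks rather than raw scores sidesteps the fact that regression coefficients and tree-based importances live on different scales; other aggregation rules (for example, counting how often a feature appears in each model's top list) would be equally plausible.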
Conclusions
Here we present the results for the classification of bladder cancer (Low and High Grade) patient samples and normal bladder epithelium controls using a machine learning application. The T-UCR panel can be used to train an eXplainable Artificial Intelligence model and to develop a robust decision support system for early bladder cancer diagnosis, provided that urinary T-UCR data from new patients are available. Using this system instead of the current methodology would result in a non-invasive approach, reducing uncomfortable procedures (such as cystoscopy) for patients. Overall, these results raise the possibility of new automatic systems that could support RNA-based prognosis and/or cancer therapy in bladder cancer patients, and they demonstrate the successful application of Artificial Intelligence to the definition of an independent prognostic biomarker panel.