Currently, using biopsy specimens for the early diagnosis of colorectal cancer (CRC) is not entirely reliable because of insufficient sample quantity and inaccurate sampling location. Thus, it is necessary to develop a signature that can accurately identify patients with CRC under these clinical scenarios. Based on the relative expression orderings of genes within individual samples, we developed a qualitative transcriptional signature to discriminate CRC tissues, including CRC-adjacent normal tissues, from tissues of non-CRC individuals. The signature was validated using multiple microarray and RNA sequencing datasets from different sources. In the training data, a signature consisting of 7 gene pairs was identified. It was well validated in both biopsy and surgical resection specimens from multiple datasets measured by different platforms. For biopsy specimens, 97.6% of 42 CRC tissues and 94.5% of 163 non-CRC (normal or inflammatory bowel disease) tissues were correctly classified. For surgically resected specimens, 99.5% of 854 CRC tissues and 96.3% of 81 CRC-adjacent normal tissues were correctly identified as CRC. Notably, we additionally profiled 33 CRC biopsy specimens on the Affymetrix platform and 13 CRC surgical resection specimens, with tumor epithelial cell proportions ranging from 40% to 100%, by RNA sequencing; all of these samples were correctly identified as CRC. The signature can be used for the early diagnosis of CRC and is also suitable for minimal biopsy specimens and inaccurately sampled specimens; it therefore has potential value for clinical application.
Background
Stemness is defined as the potential of cells for self-renewal and differentiation. Many transcriptome-based methods for stemness evaluation have been proposed. However, all of these methods show only weak negative correlations with differentiation time and cannot leverage existing experimentally validated stem cells to recognize stem-like cells.
Methods
Here, we constructed a stemness index for single-cell samples (StemSC) based on the relative expression orderings (REOs) of gene pairs. First, we identified stemness-related genes by selecting the genes significantly related to differentiation time. Then, we used 13 RNA-seq datasets comprising both bulk and single-cell embryonic stem cell (ESC) samples to construct the reference REOs. Finally, the StemSC value of a given sample was calculated as the percentage of gene pairs with the same REOs as the ESC samples.
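The final scoring step can be illustrated with a minimal sketch. It assumes the reference REOs are already available as ordered gene pairs (a, b), each meaning that gene a is expressed above gene b in the ESC samples. The function name `stemsc_score`, the gene labels, the expression values, and the handling of ties (counted as discordant) are illustrative assumptions, not details taken from the study.

```python
from typing import Dict, List, Tuple

# A reference pair (a, b) encodes the ordering "expression of a > expression of b"
# observed in the ESC reference samples (hypothetical representation).
GenePair = Tuple[str, str]


def stemsc_score(expression: Dict[str, float],
                 reference_pairs: List[GenePair]) -> float:
    """Return the percentage of reference pairs whose ordering is
    reproduced in one sample.

    Assumptions of this sketch: every gene in `reference_pairs` is present
    in `expression`, and a tie (equal expression) counts as discordant.
    """
    if not reference_pairs:
        raise ValueError("reference_pairs must not be empty")
    concordant = sum(
        1 for a, b in reference_pairs if expression[a] > expression[b]
    )
    return 100.0 * concordant / len(reference_pairs)


# Hypothetical reference orderings and a hypothetical single-cell profile.
reference_pairs = [("G1", "G2"), ("G3", "G4"), ("G5", "G6"), ("G2", "G6")]
sample = {"G1": 9.0, "G2": 4.0, "G3": 1.0, "G4": 7.0, "G5": 3.0, "G6": 2.5}

score = stemsc_score(sample, reference_pairs)
print(score)  # 3 of the 4 pairs match the reference ordering -> 75.0
```

Because the score depends only on within-sample orderings rather than absolute expression values, any monotonic rescaling of a sample's measurements leaves it unchanged, which is consistent with the batch-effect robustness reported for the index.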
Results
We validated StemSC by its higher negative correlations with differentiation time in eight normal datasets and its higher positive correlations with tumor dedifferentiation in three colorectal cancer datasets and four glioma datasets. In addition, the robustness of StemSC to batch effects enabled us to leverage existing experimentally validated cancer stem cells to recognize stem-like cells in other independent tumor datasets. The recognized stem-like tumor cells had fewer interactions with anti-tumor immune cells. Further survival analysis showed that immunotherapy-treated patients with high stemness had worse survival than those with low stemness.
Conclusions
StemSC is an improved index for quantifying stemness across datasets, and it can help researchers explore the effect of stemness on other biological processes.