Identifying how cell types and their abundances evolve during tumor progression is critical to understanding the mechanisms of metastasis and identifying its predictors. Single-cell RNA sequencing (scRNA-seq) has been especially promising in resolving heterogeneity of expression programs at the single-cell level but is not always available, for example in large cohort studies or longitudinal analyses of archived samples. In such cases, cell subpopulations must be inferred by deconvolution, a process that can infer single-cell genomic data from bulk data but has limited ability to resolve fine clonal structure. We extend our previous bulk genomic deconvolution tool, Robust and Accurate Deconvolution (RAD), to establish a new method, scRAD, that can use reference scRNA-seq to interpret sample collections for which only bulk RNA-seq is available for some samples, e.g., clonally resolving archived primary (PRM) tissues using scRNA-seq from metastases (METs). We preprocess scRNA-seq data to accurately represent gene expression profiles, yielding a signature matrix S, and then extend our RAD method via a regularization term to deconvolve bulk data while maximizing consistency with S. We validate our method on semi-synthetic data derived from human PRM breast cancer cases and bone and ovary METs, showing that scRAD improves inference of single-cell gene expression profiles and their frequencies relative to the prior RAD with random initialization or initialization using the single-cell matrix S (Table 1). We then apply scRAD to a collection of paired PRM and MET tumors to quantify progression changes in common cell types. One-sided Kaplan-Meier analysis shows that tumors inferred to increase the mast cell fraction from PRM to MET exhibit lower overall survival (p<0.05), consistent with the role of mast cells in metastatic growth and propagation.
Tumors that show increased macrophage cell fraction from PRM to MET show improved overall survival (p<0.04), consistent with the role of immune infiltration in survival.

Table 1. Mean square error (MSE) of gene expression and mixture fraction inference on semi-simulated data

| Method | Sample number | Gene expression MSE | Mixture fraction MSE |
|---|---|---|---|
| RAD with random initialization | 2 | 0.41 | 0.58 |
| RAD with random initialization | 4 | 0.28 | 0.82 |
| RAD with random initialization | 8 | 0.37 | 0.59 |
| RAD initialized with S | 2 | 0.31 | 0.39 |
| RAD initialized with S | 4 | 0.27 | 0.83 |
| RAD initialized with S | 8 | 0.37 | 0.58 |
| scRAD | 2 | 0.22 | 0.19 |
| scRAD | 4 | 0.15 | 0.20 |
| scRAD | 8 | 0.13 | 0.22 |

Citation Format: Haoyun Lei, Xiaoyan A. Guo, Yifeng Tao, Kai Ding, Xuecong Fu, Steffi Oesterreich, Adrian V. Lee, Russell Schwartz. Improved deconvolution of combined bulk and single-cell RNA-sequencing data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 5031.
Motivation: Identifying cell types and their abundances and how these evolve during tumor progression is critical to understanding the mechanisms of metastasis and identifying predictors of metastatic potential that can guide the development of new diagnostics or therapeutics. Single-cell RNA sequencing (scRNA-seq) has been especially promising in resolving heterogeneity of expression programs at the single-cell level, but is not always feasible, e.g. for large cohort studies or longitudinal analysis of archived samples. In such cases, clonal subpopulations may still be inferred via genomic deconvolution, but deconvolution methods have limited ability to resolve fine clonal structure and may require reference cell type profiles that are missing or imprecise. Prior methods can eliminate the need for reference profiles but show unstable performance when few bulk samples are available.

Results: In this work, we develop a new method using reference scRNA-seq to interpret sample collections for which only bulk RNA-seq is available for some samples, e.g. clonally resolving archived primary tissues using scRNA-seq from metastases. By integrating such information in a Quadratic Programming framework, our method can recover more accurate cell types and corresponding cell type abundances in bulk samples. Application to a breast tumor bone metastases dataset confirms the power of scRNA-seq data to improve cell type inference and quantification in same-patient bulk samples.

Availability and implementation: Source code is available on GitHub at https://github.com/CMUSchwartzLab/RADs.
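
The abstracts describe deconvolving bulk data with a regularization term that keeps the inferred expression profiles consistent with the single-cell signature matrix S, but they do not state the objective. As a rough illustration only, the sketch below assumes a quadratic penalty, minimizing ||B - CF||^2 + lam * ||C - S||^2 with nonnegative C and F and mixture fractions summing to one per sample. The function name, the `lam` parameter, the alternating clip-and-renormalize heuristic, and the simulated data are all assumptions made for this example. It is not the RAD or scRAD implementation, which is available in the repository linked above.

```python
import numpy as np


def deconvolve_with_reference(B, S, lam=1.0, n_iter=50, eps=1e-12):
    """Hypothetical reference-regularized deconvolution sketch.

    B: genes x samples bulk expression matrix.
    S: genes x cell types reference signature matrix (e.g. from scRNA-seq).
    Returns (C, F): inferred profiles (genes x cell types) and
    mixture fractions (cell types x samples). Requires n_iter >= 1.
    """
    k = S.shape[1]
    C = S.astype(float).copy()  # start from the reference profiles
    for _ in range(n_iter):
        # F-step: least squares for fractions, then a heuristic projection
        # (clip to positive values and renormalize each sample to sum to 1).
        F = np.linalg.lstsq(C, B, rcond=None)[0]
        F = np.clip(F, eps, None)
        F /= F.sum(axis=0, keepdims=True)
        # C-step: closed-form minimizer of ||B - C F||^2 + lam * ||C - S||^2,
        # i.e. C (F F^T + lam I) = B F^T + lam S, followed by clipping at 0.
        lhs = F @ F.T + lam * np.eye(k)
        rhs = B @ F.T + lam * S
        C = np.linalg.solve(lhs, rhs.T).T
        C = np.clip(C, 0.0, None)
    return C, F


# Toy semi-simulated setup (all values made up for illustration).
rng = np.random.default_rng(0)
genes, cell_types, samples = 200, 3, 4
C_true = rng.gamma(2.0, 1.0, size=(genes, cell_types))
F_true = rng.dirichlet(np.ones(cell_types), size=samples).T
B = C_true @ F_true + rng.normal(0.0, 0.05, size=(genes, samples))
# Imperfect reference: true profiles plus noise, kept nonnegative.
S = np.clip(C_true + rng.normal(0.0, 0.3, size=(genes, cell_types)), 0.0, None)

C_hat, F_hat = deconvolve_with_reference(B, S, lam=1.0)
print("gene expression MSE:", np.mean((C_hat - C_true) ** 2))
print("mixture fraction MSE:", np.mean((F_hat - F_true) ** 2))
```

In this sketch a larger `lam` pulls the inferred profiles toward S, while a smaller `lam` lets the bulk data dominate. The clipping steps are a simple heuristic rather than an exact constrained quadratic programming solve.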