Background
Single-cell RNA-sequencing technologies provide a powerful tool for systematic dissection of cellular heterogeneity. However, the prevalence of dropout events imposes complications during data analysis and, despite numerous efforts from the community, this challenge has yet to be solved.
Results
Here we present a computational method, called RESCUE, to mitigate the dropout problem by imputing gene expression levels using information from other cells with similar patterns. Unlike existing methods, we use an ensemble-based approach to minimize the bias that feature selection introduces into imputation. By comparative analysis of simulated and real single-cell RNA-seq datasets, we show that RESCUE outperforms existing methods in terms of imputation accuracy, which leads to more precise cell-type identification.
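The idea of ensemble-based imputation from similar cells can be illustrated with a minimal Python sketch. This is not the RESCUE implementation (which is in R) and makes several simplifying assumptions: similar cells are found by plain k-nearest-neighbor search rather than clustering, every zero is treated as a candidate dropout, and the function name `ensemble_impute` and its parameters `n_rounds`, `gene_fraction`, and `k` are hypothetical.

```python
import numpy as np


def ensemble_impute(expr, n_rounds=20, gene_fraction=0.6, k=5, seed=0):
    """Fill zero entries of a cells x genes matrix with neighbor averages.

    Each round draws a random gene subset, finds each cell's k nearest
    cells on that subset, and records the neighbors' mean expression.
    Averaging over rounds reduces dependence on any single gene selection.
    """
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    n_cells, n_genes = expr.shape
    n_sub = max(1, int(round(gene_fraction * n_genes)))
    estimate = np.zeros_like(expr)

    for _ in range(n_rounds):
        # Random feature (gene) subset for this ensemble member.
        genes = rng.choice(n_genes, size=n_sub, replace=False)
        sub = expr[:, genes]
        # Pairwise squared Euclidean distances between cells on the subset.
        sq = (sub ** 2).sum(axis=1)
        dist = sq[:, None] + sq[None, :] - 2.0 * sub @ sub.T
        np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
        neighbors = np.argsort(dist, axis=1)[:, :k]
        # Mean expression of the k neighbors, computed over all genes.
        estimate += expr[neighbors].mean(axis=1)

    estimate /= n_rounds
    # Assumption: only zeros are candidate dropouts; observed values are kept.
    return np.where(expr == 0, estimate, expr)


# Toy data (illustrative only): two cell groups with opposite profiles.
group_a = np.tile([10.0, 10.0, 0.5, 0.5, 10.0, 0.5], (6, 1))
group_b = np.tile([0.5, 0.5, 10.0, 10.0, 0.5, 10.0], (6, 1))
observed = np.vstack([group_a, group_b])
observed[0, 0] = 0.0  # simulate one dropout in a group-A cell
imputed = ensemble_impute(observed, k=3)
```

In this toy example the simulated dropout is filled from other group-A cells, while every nonzero entry is left untouched. Real data would also require normalization and a principled way to distinguish dropouts from true zeros, which this sketch does not attempt.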
Conclusions
Taken together, these results suggest that RESCUE is a useful tool for mitigating dropouts in single-cell RNA-seq data. RESCUE is implemented in R and available at https://github.com/seasamgo/rescue.
Electronic supplementary material
The online version of this article (10.1186/s12859-019-2977-0) contains supplementary material, which is available to authorized users.
Patients with relapsed/refractory (R/R) diffuse large B-cell lymphoma (DLBCL) have heterogeneous outcomes; durable remissions are infrequently observed with standard approaches. Circulating tumor DNA (ctDNA) assessment is a sensitive, potentially prognostic tool in this setting. We assessed baseline ctDNA to identify patients with R/R DLBCL at high risk of relapse after receiving polatuzumab vedotin plus bendamustine and rituximab (BR), or BR alone. Patients were transplant-ineligible and had received ≥1 prior line of therapy. The ctDNA assay, based on a customized panel of recurrently mutated genes in DLBCL, measured mutant molecules per mL (MMPM) at baseline and end of treatment (EOT). Endpoints included progression-free survival (PFS) and overall survival (OS) in subgroups stratified by baseline ctDNA, and log-fold change in ctDNA at EOT versus baseline. In biomarker-evaluable patients (n=33), baseline ctDNA level correlated with serum lactate dehydrogenase (LDH) concentration, number of prior therapies, stage, and International Prognostic Index (IPI) score. After adjusting for number of prior therapies ≥2, IPI score ≥3, and LDH above the upper limit of normal, high (> median) baseline ctDNA MMPM was independently prognostic for shorter PFS (adjusted HR 0.18 [95% CI: 0.05-0.65]) and OS (adjusted HR 0.20 [95% CI: 0.06-0.68]). In 23 patients with baseline and EOT samples, a significantly greater decrease in ctDNA MMPM was observed in patients with complete response (n=13) than in those without complete response (n=10); P=0.0025. Baseline ctDNA assessment may identify patients at high risk of progression and should be further evaluated as a monitoring tool in R/R DLBCL. ClinicalTrials.gov identifier: NCT02257567.
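The two ctDNA quantities described above, stratification at the median baseline level and log-fold change at EOT versus baseline, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the MMPM values are made up, base-10 logarithms are an assumption (the abstract does not state the base), and the handling of undetectable ctDNA is left out.

```python
import math
from statistics import median


def log10_fold_change(baseline_mmpm, eot_mmpm):
    """Log10 of the EOT/baseline ratio; negative values mean a decrease."""
    if baseline_mmpm <= 0 or eot_mmpm <= 0:
        # Undetectable ctDNA needs an assay-specific rule, omitted here.
        raise ValueError("both MMPM values must be positive")
    return math.log10(eot_mmpm / baseline_mmpm)


def stratify_by_median(baseline_values):
    """Label each value 'high' if strictly above the cohort median, else 'low'."""
    cutoff = median(baseline_values)
    return ["high" if v > cutoff else "low" for v in baseline_values]


# Made-up baseline MMPM values for five hypothetical patients.
baseline = [1200.0, 35.0, 480.0, 9000.0, 150.0]
groups = stratify_by_median(baseline)

# A hypothetical patient whose ctDNA falls from 1000 to 10 MMPM.
change = log10_fold_change(1000.0, 10.0)
```

With these toy numbers the median is 480, so only the two patients strictly above it are labeled high, and the 100-fold drop corresponds to a log10 fold change of about -2.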