The low capture rate of expressed RNAs from single-cell sequencing technology is one of the major obstacles to downstream functional genomics analyses. Recently, a number of imputation methods have emerged for single-cell transcriptome data; however, recovering missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), to impute gene expression matrices by using a biased low-rank matrix decomposition method. WEDGE successfully recovered expression matrices, reproduced the cell-wise and gene-wise correlations, and improved the clustering of cells, performing impressively for applications with sparse datasets. Overall, this study shows a potent approach for imputing sparse expression matrix data, and our WEDGE algorithm should help many researchers to more profitably explore the biological meanings embedded in their single-cell RNA sequencing datasets. The source code of WEDGE has been released at https://github.com/QuKunLab/WEDGE.
The low capture rate of expressed RNAs from single-cell sequencing technology is one of the major obstacles to downstream functional genomics analyses. Recently, a number of recovery methods have emerged to impute single-cell transcriptome profiles; however, restoring missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), which imputes expression matrices by using a low-rank matrix decomposition method. WEDGE successfully restored expression matrices, reproduced the cell-wise and gene-wise correlations, and improved the clustering of cells, performing impressively for applications with multiple cell type datasets with high dropout rates. Overall, this study demonstrates a potent approach for recovering sparse expression matrix data, and our WEDGE algorithm should help many researchers to more profitably explore the biological meanings embedded in their scRNA-seq datasets.
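The idea of imputing a sparse expression matrix through a weighted low-rank decomposition can be illustrated with a small sketch. This is not the WEDGE implementation (which is available at the repository linked above); it is a generic weighted non-negative factorization with multiplicative updates, written under several assumptions: zero entries are treated as possibly missing and given a lower weight than observed entries, and the function names (`weighted_lowrank_impute`, `weighted_loss`), the `zero_weight` value, the rank, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_loss(A, recon, zero_weight):
    """Weighted squared error: zero entries of A count less than nonzero ones."""
    M = np.where(A > 0, 1.0, zero_weight)
    return float(np.sum(M * (A - recon) ** 2))


def weighted_lowrank_impute(A, rank=2, zero_weight=0.15, n_iter=200, seed=0):
    """Illustrative weighted low-rank imputation (NOT the published WEDGE code).

    Approximates A (genes x cells) by W @ H with non-negative factors,
    minimizing sum_ij M_ij * (A_ij - (W H)_ij)^2, where M_ij = 1 for
    nonzero entries and M_ij = zero_weight (an assumed value) for zeros.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    M = np.where(A > 0, 1.0, zero_weight)   # per-entry weights
    n_genes, n_cells = A.shape
    W = rng.random((n_genes, rank)) + 0.1   # strictly positive start
    H = rng.random((rank, n_cells)) + 0.1
    eps = 1e-10                             # guards against division by zero
    MA = M * A
    for _ in range(n_iter):
        # Multiplicative updates for the weighted least-squares objective.
        W *= (MA @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ MA) / (W.T @ (M * (W @ H)) + eps)
    return W @ H                            # dense, imputed matrix


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    truth = rng.random((20, 3)) @ rng.random((3, 15))    # rank-3 toy data
    observed = truth * (rng.random(truth.shape) > 0.5)   # simulate dropout
    imputed = weighted_lowrank_impute(observed, rank=3)
    print(imputed.shape)
```

In this sketch, down-weighting the zeros lets the low-rank reconstruction fill them with values implied by the observed entries rather than forcing them toward zero; how the weights should be chosen for real data is a question the abstract does not address.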
Unsupervised clustering is a fundamental step of single-cell RNA sequencing data analysis, and its importance has inspired several methods for classifying cells in such data. However, accurate prediction of the cell clusters remains a substantial challenge. In this study, we propose a new algorithm for single-cell RNA sequencing data clustering based on Sparse Optimization and low-rank matrix factorization (scSO). We applied our scSO algorithm to analyze multiple benchmark datasets and showed that the cluster number predicted by scSO was close to the number of reference cell types and that most cells were correctly classified. Our scSO algorithm is available at https://github.com/QuKunLab/scSO. Overall, this study demonstrates a potent cell clustering approach that can help researchers distinguish cell types in single-cell RNA sequencing data.