2022
DOI: 10.26508/lsa.202201591

Resolution of the curse of dimensionality in single-cell RNA sequencing data analysis

Abstract: Single-cell RNA sequencing (scRNA-seq) can determine gene expression in numerous individual cells simultaneously, promoting progress in the biomedical sciences. However, scRNA-seq data are high-dimensional with substantial technical noise, including dropouts. During analysis of scRNA-seq data, such noise engenders a statistical problem known as the curse of dimensionality (COD). Based on high-dimensional statistics, we herein formulate a noise reduction method, RECODE (resolution of the curse of dimensionality…

Cited by 15 publications
(21 citation statements)
References 64 publications
“…When determining distances between different cell states in high-dimensional space, we often face the challenge of the curse of dimensionality ( Imoto et al, 2022 ). The higher the number of dimensions, the more severe the issue, as noise disproportionately adds up to undermine the true (biological) signal when considering multiple dimensions.…”
Section: Discussion (mentioning)
confidence: 99%
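
The effect described in the statement above can be illustrated with a toy simulation. This is only a sketch under assumed conditions and is not taken from Imoto et al.: two hypothetical cell states differ in a fixed number of "signal" dimensions, independent Gaussian noise is added to every dimension, and the function name `distance_contrast` and all parameter values are illustrative. As the number of noisy dimensions grows, the ratio of between-state to within-state squared distance drifts toward 1, so distances stop separating the two states.

```python
import numpy as np

def distance_contrast(d, n_signal=10, shift=3.0, sigma=1.0, m=100, seed=0):
    """Ratio of mean between-state to mean within-state squared distance.

    Toy model (an assumption, not RECODE's noise model): state A is centered
    at the origin, state B is shifted by `shift` in the first `n_signal`
    dimensions, and every one of the `d` dimensions carries Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    center_b = np.zeros(d)
    center_b[:n_signal] = shift
    a = rng.normal(0.0, sigma, size=(m, d))             # cells from state A
    b = center_b + rng.normal(0.0, sigma, size=(m, d))  # cells from state B
    half = m // 2
    # Pair the first half of A with the second half of A (same state).
    within = np.mean(np.sum((a[:half] - a[half:]) ** 2, axis=1))
    # Pair each A cell with a B cell (different states).
    between = np.mean(np.sum((a - b) ** 2, axis=1))
    return between / within

contrasts = {d: distance_contrast(d) for d in (10, 100, 10_000)}
for d, ratio in contrasts.items():
    print(f"d = {d:>6}: between/within = {ratio:.3f}")
```

In this model the expected ratio is (2dσ² + n_signal·shift²) / (2dσ²), so the fixed signal term is swamped by the noise term as d increases.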
“…After putatively excluding doublets/multiplets using the Scrublet python package (v. 0.2.1; Wolock et al , 2019), interspecies doublets composed of mouse somatic cells and cyESC‐derived cells were removed if common cell barcodes were detected in both filtered count matrices mapped to GRCm38.p6 and MacFas5.0. Then, the filtered count matrix was applied to noise reduction by RECODE (resolution of the curse of dimensionality; Imoto et al , 2022) and normalized by 100 k.…”
Section: Methods (mentioning)
confidence: 99%
“…Note that “unclassified” cells in cy fetal ovary germ cells (Mizuta et al , 2022) were not included. Their raw count matrices were again processed by RECODE (Imoto et al , 2022), normalized by 100 k, and then log‐transformed [log (size‐scaled (ss) UMI + 1)]. Batch corrections, identification of highly variable genes (HVGs), and cell clustering were performed as described previously (Satija et al , 2015; Mizuta et al , 2022), and the germ cells were classified into 12 clusters.…”
Section: Methods (mentioning)
confidence: 99%
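
The normalization steps quoted in the two Methods statements above can be sketched as follows. This is a minimal illustration, not the citing authors' code: it assumes that "normalized by 100 k" means scaling each cell's total count to 100,000, that cells are rows and genes are columns, and that the logarithm is the natural log (the quoted text does not state the base). The RECODE denoising step is deliberately omitted, and the function name `size_scale_and_log` is hypothetical.

```python
import numpy as np

def size_scale_and_log(counts, target_sum=100_000):
    """Scale each cell (row) to `target_sum` total counts, then apply log(x + 1).

    Assumptions: rows are cells, columns are genes, natural logarithm.
    Cells with zero total counts are left as all zeros.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    scaled = np.divide(
        counts * target_sum,
        totals,
        out=np.zeros_like(counts),
        where=totals > 0,  # skip division for empty cells
    )
    return np.log1p(scaled)

# Three toy cells by two genes; the second cell is empty.
toy_counts = np.array([[1, 3], [0, 0], [5, 5]])
log_expr = size_scale_and_log(toy_counts)
print(log_expr)
```

Undoing the log with `np.expm1(log_expr)` recovers the size-scaled values, whose row sums equal `target_sum` for every non-empty cell.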
“…Therefore, we used these cell types in the later stage (n = 1,876 cells and d = 27,998 genes), pre-endocrine and mature endocrines (α-, β-, δ-, and ε-cells), which were previously annotated in the literature [1,3]. We analyzed the scRNA-seq data as follows (see Appendix B for details): As a pre-analysis of V-Mapper, we used RECODE (resolution of the curse of dimensionality) [12] and RNA velocity [3,14] (Fig. 5).…”
Section: Application To Single-cell Gene Expression Data (mentioning)
confidence: 99%
“…The original scRNA-seq data of the pancreas endocrine cell was obtained from the datasets.pancreas function in the scvelo package [3]. We created denoised and velocity data by applying RECODE [12] and RNA-velocity with the stochastic mode parameter [3] to the original data. Next, we extracted data regarding the cells of pre-endocrine and α-, β-, δ-, and ε-cell types from the denoised and velocity data.…”
Section: B Analysis In Section (mentioning)
confidence: 99%