2023
DOI: 10.1038/s41598-023-32966-x

Effect of distance measures on confidences of t-SNE embeddings and its implications on clustering for scRNA-seq data

Abstract: Arguably one of the most famous dimensionality reduction algorithms today is t-distributed stochastic neighbor embedding (t-SNE). Although widely used for the visualization of scRNA-seq data, it is prone to errors, like any algorithm, and may lead to inaccurate interpretations of the visualized data. A reasonable way to avoid misinterpretations is to quantify the reliability of the visualizations. The focus of this work is first to find the best possible way to predict sample-based confidence scores for t…

citations
Cited by 7 publications
(3 citation statements)
references
References 33 publications
“…A multinomial logistic regression (MLR) model, using the multinom function ("nnet", v7.3-19), was applied to calculate the contribution of specific reactions to pre-defined clusters of GSMMs (i.e. phylogenetic clades/tissue types).…”
Section: Additional Computational Analyses
Citation type: mentioning
confidence: 99%
“…However, this method lacks insight into specific pathways driving heterogeneity. t-SNE analysis [10,14], while effective in clustering GSMMs consistently, requires specific prerequisites, such as distance metrics, and hyperparameters, which might result in erroneous clustering outcomes [15][16][17], and may lack reproducibility due to its non-deterministic nature. Additionally, it does not provide a straightforward identification of key variables driving clustering.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Binary data have been efficaciously subjected to t-SNE analysis [9,18], yielding consistent clustering of GSMMs. However, t-SNE mandates specific prerequisites, such as distance metrics, and hyperparameters, which might result in erroneous clustering outcomes [19][20][21]. Additionally, the inherent non-deterministic nature of t-SNE poses reproducibility concerns.…”
Section: Introduction
Citation type: mentioning
confidence: 99%