2019
DOI: 10.1101/671404
Preprint

Embedding to Reference t-SNE Space Addresses Batch Effects in Single-Cell Classification

Abstract: Dimensionality reduction techniques, such as t-SNE, can construct informative visualizations of high-dimensional data. When working with multiple data sets, a straightforward application of these methods often fails; instead of revealing underlying classes, the resulting visualizations expose data set-specific clusters. To circumvent these batch effects, we propose an embedding procedure that takes a t-SNE visualization constructed on a reference data set and uses it as a scaffold for embedding new data. The n…

Cited by 7 publications (6 citation statements)
References 24 publications
“…t-SNE is an unsupervised nonlinear probabilistic exploratory and visualization algorithm, which embeds high-dimensional data for visualization in a low-dimensional space.³⁶ In brief, the principle behind t-SNE is calculation and comparison of probabilities of proximity in higher- and lower-dimensional space. This comparison is the premise for visualization, where the differences are attempted to be minimized in a lower-dimensional space, using local minima by applying a gradient descent.…”
Section: Methods (mentioning)
confidence: 99%
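
The principle described in this statement (compare proximity probabilities in the high- and low-dimensional spaces, then reduce the mismatch by gradient descent) can be illustrated with a minimal sketch. It is a simplification, not the implementation used in any of the cited papers: it assumes a single fixed Gaussian bandwidth `sigma` instead of the per-point bandwidth that t-SNE normally tunes to a target perplexity, and it omits momentum and early exaggeration. All function names are illustrative.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def high_dim_affinities(X, sigma=1.0):
    """Joint probabilities P from Gaussian kernels (fixed sigma is an assumption)."""
    n = len(X)
    cond = []
    for i in range(n):
        w = [math.exp(-sq_dist(X[i], X[j]) / (2 * sigma ** 2)) if j != i else 0.0
             for j in range(n)]
        total = sum(w)
        cond.append([v / total for v in w])
    # Symmetrize the conditional probabilities so that P sums to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]


def low_dim_affinities(Y):
    """Joint probabilities Q from a Student-t kernel, plus the raw kernel values."""
    n = len(Y)
    W = [[1.0 / (1.0 + sq_dist(Y[i], Y[j])) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    total = sum(map(sum, W))
    return [[v / total for v in row] for row in W], W


def kl_divergence(P, Q):
    """KL(P || Q), the mismatch that t-SNE tries to reduce."""
    return sum(p * math.log(p / q)
               for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q) if p > 0)


def tsne_step(P, Y, lr):
    """One plain gradient-descent step on the KL divergence."""
    Q, W = low_dim_affinities(Y)
    n, d = len(Y), len(Y[0])
    new_Y = []
    for i in range(n):
        grad = [0.0] * d
        for j in range(n):
            if i != j:
                c = 4.0 * (P[i][j] - Q[i][j]) * W[i][j]
                for k in range(d):
                    grad[k] += c * (Y[i][k] - Y[j][k])
        new_Y.append([Y[i][k] - lr * grad[k] for k in range(d)])
    return new_Y


# Toy data (made up for illustration): two well-separated groups in 3-D.
X = [[0, 0, 0], [0.5, 0, 0], [0, 0.5, 0], [5, 5, 5], [5.5, 5, 5], [5, 5.5, 5]]
rng = random.Random(0)
Y = [[rng.gauss(0, 0.01), rng.gauss(0, 0.01)] for _ in X]

P = high_dim_affinities(X)
kl_before = kl_divergence(P, low_dim_affinities(Y)[0])
for _ in range(200):
    Y = tsne_step(P, Y, lr=0.5)
kl_after = kl_divergence(P, low_dim_affinities(Y)[0])
```

Comparing `kl_before` with `kl_after` shows how far the descent has reduced the mismatch. Like any gradient descent on a non-convex objective, the loop settles in a local minimum that depends on the initial `Y`.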
“…t-SNE is an unsupervised nonlinear probabilistic exploratory and visualization algorithm, which embeds high-dimensional data for visualization in a low-dimensional space. In brief, the principle behind t-SNE is calculation and comparison of probabilities of proximity in higher- and lower-dimensional space.…”
Section: Methods (mentioning)
confidence: 99%
“…Gene expression is normalized using regularized negative binomial regression as implemented in SCTransform (18), which also provides variance estimates for each gene. The top 4,000 highly variable genes (HVGs) are used to reduce the dimensionality of the gene-cell matrix through PCA, and the first 20 principal components are used as an initialization for a two-dimensional Fast Fourier Transform-accelerated t-SNE embedding (19,20). The cells are clustered in PCA space using Louvain modularity optimization after being embedded in a shared nearest-neighbors graph.…”
Section: Initial Clustering (mentioning)
confidence: 99%
“…We extend the t-SNE algorithm to enable the embedding of new data points to a reference embedding [37]. We adapt the standard t-SNE formulation from Eqs.…”
Section: Adding Data To a Reference Embedding (mentioning)
confidence: 99%
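
The idea of adding new points to a fixed reference embedding can be sketched as follows. This is an illustrative simplification and is not claimed to match the formulation in the preprint or in the citing paper: it assumes a fixed Gaussian bandwidth `sigma` rather than a perplexity-tuned one, it places each new point independently of other new points, and the name `embed_new_point` is hypothetical. The reference coordinates `Y_ref` are never modified; only the position of the new point is optimized.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def embed_new_point(x_new, X_ref, Y_ref, sigma=1.0, lr=0.1, n_steps=100):
    """Place one new point into a fixed reference embedding (hypothetical helper).

    X_ref: reference points in the original space.
    Y_ref: their fixed low-dimensional coordinates.
    """
    # Affinities of the new point to every reference point (fixed sigma: assumption).
    w = [math.exp(-sq_dist(x_new, x) / (2 * sigma ** 2)) for x in X_ref]
    total = sum(w)
    p = [v / total for v in w]

    d = len(Y_ref[0])
    # Start at the affinity-weighted mean of the reference coordinates.
    y = [sum(p[j] * Y_ref[j][k] for j in range(len(Y_ref))) for k in range(d)]

    for _ in range(n_steps):
        # Student-t similarities between the new point and the reference points.
        t = [1.0 / (1.0 + sq_dist(y, y_ref)) for y_ref in Y_ref]
        t_sum = sum(t)
        # Gradient of KL(p || q) with respect to y, where q_j = t_j / t_sum.
        grad = [0.0] * d
        for j, y_ref in enumerate(Y_ref):
            c = 2.0 * (p[j] - t[j] / t_sum) * t[j]
            for k in range(d):
                grad[k] += c * (y[k] - y_ref[k])
        y = [y[k] - lr * grad[k] for k in range(d)]
    return y


# Toy reference (made up for illustration): two groups and their 2-D coordinates.
X_ref = [[0, 0, 0], [0.5, 0, 0], [0, 0.5, 0], [5, 5, 5], [5.5, 5, 5], [5, 5.5, 5]]
Y_ref = [[-3.0, 0.0], [-3.2, 0.3], [-2.8, -0.3], [3.0, 0.0], [3.2, 0.3], [2.8, -0.3]]

# A new point that lies close to the first group in the original space.
y_new = embed_new_point([0.2, 0.2, 0.0], X_ref, Y_ref)
```

Because the reference layout stays fixed, a new point is positioned only by its similarity to reference points, so it lands near the group it resembles. This is the sense in which the abstract describes the reference visualization as a scaffold for new data.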