2020 IEEE International Conference on Big Data (Big Data) 2020
DOI: 10.1109/bigdata50022.2020.9377843

Sketch and Scale: Geo-distributed tSNE and UMAP

Abstract: Running machine learning analytics over geographically distributed datasets is a rapidly emerging problem in a world of data management policies that ensure privacy and data security. Visualizing high-dimensional data with tools such as t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP) has become common practice for data scientists. Both tools scale poorly in time and memory. While recent optimizations showed successful handling of 10,000 data points, scal…

Cited by 4 publications (1 citation statement)
References 22 publications
“…UMAP is a dimensionality reduction technique based on Riemannian geometry and algebraic topology theory used for non‐linear dimensionality reduction [29]. It has high speed when processing large data sets and can effectively retain the global structure of the data [31].…”
Section: Algorithm Selection and Evaluation Analysis (mentioning)
confidence: 99%