2022
DOI: 10.1007/s41060-022-00348-7

Optimizing graph layout by t-SNE perplexity estimation

Abstract: Perplexity is one of the key parameters of the dimensionality reduction algorithm t-distributed stochastic neighbor embedding (t-SNE). In this paper, we investigated the relationship between t-SNE perplexity and graph layout evaluation metrics, including graph stress, preserved neighborhood information, and visual inspection. As we found that a small perplexity is correlated with a relatively higher normalized stress while preserving neighborhood information with higher precision but less global structure information,…
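The abstract names normalized stress as one of the layout metrics but the visible text does not define it. As a minimal sketch, assuming one common definition (squared differences between graph-theoretic hop distances and Euclidean layout distances, divided by the sum of squared hop distances, with no extra scaling factor), the metric could be computed as below. The function names, the toy path graph, and the two layouts are hypothetical and are not taken from the paper.

```python
from collections import deque
from itertools import combinations
from math import dist


def shortest_path_lengths(adjacency):
    """All-pairs hop distances in an unweighted, connected graph via BFS."""
    lengths = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        lengths[source] = seen
    return lengths


def normalized_stress(adjacency, layout):
    """Assumed definition: sum((d_graph - d_layout)^2) / sum(d_graph^2)."""
    hops = shortest_path_lengths(adjacency)
    numerator = 0.0
    denominator = 0.0
    for u, v in combinations(adjacency, 2):
        d_graph = hops[u][v]                    # hop distance in the graph
        d_layout = dist(layout[u], layout[v])   # Euclidean distance in the drawing
        numerator += (d_graph - d_layout) ** 2
        denominator += d_graph ** 2
    return numerator / denominator


# Hypothetical toy input: the path graph a - b - c drawn two ways.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
folded = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 0.0)}

stress_straight = normalized_stress(path, straight)  # layout matches hop distances
stress_folded = normalized_stress(path, folded)      # a and c collapse onto one point
```

The straight drawing reproduces every hop distance, so its stress is zero, while the folded drawing keeps each edge at unit length but loses the a-to-c distance. That is the kind of trade-off the abstract describes: local neighborhoods intact, global structure distorted. The sketch assumes a connected graph; a disconnected one would need a convention for unreachable pairs.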

Cited by 8 publications (2 citation statements)
References 14 publications
“…Multidimensional datasets often become sparse as the number of features increases, which can negatively impact clustering success. To address this, increasing the sample size or using dimensionality reduction algorithms like Principal Component Analysis, Linear Discriminant Analysis, Factor Analysis, t-SNE, and others can help reduce dimensions and improve data density (Gao et al, 2021;Wang et al, 2021;Mair, 2018;Xiao et al, 2023;Groth et al, 2013). Before creating a hypothesis, it is critical to evaluate the impact of each feature on cluster results using methods such as information retrieval.…”
Section: Dataset Preparation Procedures (mentioning)
confidence: 99%
“…When the data size is too large, using principal component analysis (PCA) or t-distributed stochastic neighbor embedding (Xiao et al, 2023) can first reduce the size of the data and then apply DBSCAN.…”
Section: DBSCAN With Data Reduced in Size With t-SNE (mentioning)
confidence: 99%