2014 IEEE International Conference on Big Data (Big Data) 2014
DOI: 10.1109/bigdata.2014.7004253

Topic similarity networks: Visual analytics for large document sets

Abstract: We investigate ways in which to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on examining what we refer to as topic similarity networks: graphs in which nodes represent latent topics in text collections and links represent similarity among topics. We describe efficient and effective approaches to both building and labeling such networks. Visualizations of topic models based on these networks are shown to be a powerful means of exploring, character…
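
The abstract defines a topic similarity network as a graph whose nodes are topics and whose links encode similarity. As a rough illustration only, the sketch below links two topics when the Hellinger distance between their word distributions falls under a cutoff. The function names, the toy distributions, and the `max_distance` cutoff are all assumptions made for this example; the excerpt does not say how the paper itself builds or prunes its networks.

```python
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def build_topic_network(topics, max_distance=0.5):
    """Return edges (i, j, distance) for topic pairs within max_distance.

    `max_distance` is a hypothetical cutoff chosen for illustration.
    """
    edges = []
    for (i, p), (j, q) in combinations(enumerate(topics), 2):
        d = hellinger(p, q)
        if d <= max_distance:
            edges.append((i, j, d))
    return edges


# Toy topic-word distributions over a four-word vocabulary (made-up values).
topics = [
    [0.6, 0.3, 0.1, 0.0],
    [0.5, 0.4, 0.1, 0.0],
    [0.0, 0.1, 0.3, 0.6],
]

edges = build_topic_network(topics, max_distance=0.3)
for i, j, d in edges:
    print(f"topic {i} -- topic {j} (Hellinger distance {d:.3f})")
```

With these toy values only the first two topics are linked, since the third places most of its mass on different words. A real pipeline would read the distributions from a fitted LDA model rather than hard-coding them.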

Cited by 16 publications
(9 citation statements)
References 20 publications
“…Gretarsson et al. [9] estimated the similarity of latent topics by computing the L1 distance. Later, Maiya et al. [5] adopted the Hellinger distance metric to calculate the similarity between topics. On the other hand, the most common distance metric used in the literature is the cosine similarity [6,10,11].…”
Section: Related Work (mentioning)
confidence: 99%
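
The three measures named in this statement (L1 distance, Hellinger distance, cosine similarity) can be compared side by side on toy data. This is a minimal sketch using their standard textbook definitions; the function names and the example distributions are assumptions for illustration and are not taken from any of the cited papers.

```python
import math


def l1_distance(p, q):
    """Sum of absolute differences; ranges from 0 to 2 for probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def hellinger_distance(p, q):
    """Hellinger distance; ranges from 0 (identical) to 1 (disjoint support)."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def cosine_similarity(p, q):
    """Cosine of the angle between the vectors; 1 means same direction."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


# Two made-up topic-word distributions with no words in common.
topic_a = [0.5, 0.5, 0.0, 0.0]
topic_b = [0.0, 0.0, 0.5, 0.5]

print(l1_distance(topic_a, topic_b))
print(hellinger_distance(topic_a, topic_b))
print(cosine_similarity(topic_a, topic_b))
```

Note that the first two are distances (larger means less alike) while cosine is a similarity (larger means more alike), so they need to be converted to a common direction before being compared or thresholded.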
“…It is often used as a topic similarity metric in prior work [6,9]. The Hellinger distance-based metric (HDM) is often used to quantify the similarity between a pair of probability distributions, as in [5].…”
Section: Related Work (mentioning)
confidence: 99%
“…Network-based similarity measurements can be used for classification on large scale document collections [15], to provide unsupervised hierarchic similarity structure [16] and also to benefit visualizations to better understand "similarity" [17]. Deepwalk [18] seeks vector representations using recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs.…”
Section: Related Work (mentioning)
confidence: 99%