Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval 2013
DOI: 10.1145/2484028.2484166

Improving LDA topic models for microblogs via tweet pooling and automatic labeling

Abstract: Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic abstracts, they are often less coherent when applied to microblog content like Twitter. In this paper, we investigate methods to improve topics learned from Twitter content without modifying the basic machinery of LDA; we achieve this through various pooling …

citations
Cited by 384 publications
(270 citation statements)
references
References 12 publications
“…In the M step of EM, φ_{D×N×K} is used to update the variational Dirichlet prior λ_{K×N} of β. The parameters' optimisation formulas in the EM algorithm are shown in Equation (2). Finally, β and θ can be obtained when the lower bound converges.…”
Section: Two LDA Implementations: Sampling and VB Approaches (mentioning)
confidence: 99%
“…Assigning all words in a tweet to a single topic could increase word overlaps and thus result in mixed topics. 2) Multiple posts are combined into a virtual document [2,11], also known as the pooling strategy (e.g. tweets from a single user are combined into a single document [2]).…”
Section: Related Work (mentioning)
confidence: 99%
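
The pooling strategy described in this citation statement, where tweets from a single user are combined into one virtual document, can be sketched roughly as follows. This is a minimal illustration only, assuming tweets arrive as (author, text) pairs; the function name `pool_by_author` and the toy data are hypothetical and are not taken from the paper or from the citing work.

```python
from collections import defaultdict


def pool_by_author(tweets):
    """Group tweet texts by author and join each group into one pooled document.

    `tweets` is assumed to be an iterable of (author, text) pairs.
    """
    pooled = defaultdict(list)
    for author, text in tweets:
        pooled[author].append(text)
    # One "virtual document" per author: that author's tweets joined by spaces,
    # in the order they were seen.
    return {author: " ".join(texts) for author, texts in pooled.items()}


# Hypothetical toy data, for illustration only.
tweets = [
    ("alice", "lda topics on short text"),
    ("bob", "football match tonight"),
    ("alice", "pooling tweets helps topic models"),
]
docs = pool_by_author(tweets)
```

The pooled documents in `docs` could then be passed to an unmodified LDA implementation in place of the individual tweets, which matches the abstract's stated aim of leaving the basic machinery of LDA unchanged.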