2018
DOI: 10.1177/1536867x1801800107

Ldagibbs: A Command for Topic Modeling in Stata Using Latent Dirichlet Allocation

Abstract: In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of l…
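The two distributions described in the abstract (documents over topics, topics over words) can be illustrated with a small collapsed Gibbs sampler. This is a minimal sketch in Python, not the ldagibbs implementation or its Stata syntax: the function name `lda_gibbs`, its parameters, the default values for `alpha` and `beta`, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of tokenized documents (lists of words). Returns
    (theta, phi): theta[d] is document d's distribution over topics,
    phi[k] is topic k's distribution over words.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                       # tokens in doc d assigned to topic k
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                                     # tokens assigned to topic k overall
    assign = []                                                        # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        topics = []
        for w in doc:
            k = rng.randrange(num_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(topics)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates from the final counts; each row sums to one.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in vocab}
        for t in range(num_topics)
    ]
    return theta, phi


# Toy corpus (made up for illustration).
toy_docs = [
    ["rate", "inflation", "rate", "bank"],
    ["goal", "match", "team", "goal"],
    ["bank", "inflation", "policy"],
]
theta, phi = lda_gibbs(toy_docs, num_topics=2)
```

Here `theta[d]` plays the role of the per-document topic distribution and `phi[k]` the per-topic word distribution. A real analysis would also need text cleaning and many more iterations than this sketch assumes.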

Cited by 56 publications (54 citation statements)
References: 10 publications
“…As explained in Schwarz (2018), Latent Dirichlet Allocation (LDA) is composed of two parts. The first is a probabilistic model describing the text data as a likelihood function.…”
Section: Methods (mentioning)
confidence: 99%
“…The authors employ Latent Dirichlet Allocation (LDA), a machine learning algorithm developed by Blei et al. (2003) which allows for the automatic clustering of 1,892 ECB Board speeches. The aim of the current paper is to introduce the LDA methodology as presented in Schwarz (2018) and obtain results using the ldagibbs Stata command. We analyze 45,346 entries or passages of the Federal Open Market Committee (FOMC) during the period 2003-2012, with the goal of identifying the evolution of the different topics discussed by the members of the FOMC.…”
Section: Introduction (mentioning)
confidence: 99%
“…Its performance will be tested by processing real-time information of the Twitter platform via the Twitter API, and more advanced text preprocessing techniques will be incorporated to improve the quality of classification of the sentiment. Further enhancements include the portability to other platforms such as mobile devices and the incorporation of Latent Dirichlet Allocation (LDA) technique to automatically identify the number of attributes of corpus [57][58][59][60].…”
Section: Discussion (mentioning)
confidence: 99%
“…Given that the great majority of stored data is text, the automatic analysis of text data has become an important research problem. To this end, [1][2][3][4]. Latent Dirichlet allocation (LDA), a generative graphical model developed for modeling discrete data such as documents, uncovers the latent topics that make up a document [5].…”
Section: Introduction (unclassified)