2013
DOI: 10.1007/s10994-013-5413-0

Interactive topic modeling

Abstract: Topic models have been used extensively as a tool for corpus exploration, and a cottage industry has developed to tweak topic models to better encode human intuitions or to better model data. However, creating such extensions requires expertise in machine learning unavailable to potential end-users of topic modeling software. In this work, we develop a framework for allowing users to iteratively refine the topics discovered by models such as latent Dirichlet allocation (LDA) by adding constraints that enforce …

Cited by 253 publications (191 citation statements)
References 31 publications
“…The generative process of nI-cLDA is as follows. It is essentially the same as (Hu et al, 2014): 1. For each topic k …”
Section: A Dataset Preprocessing (mentioning)
confidence: 99%
“…• nI-cLDA, non-interactive constrained Latent Dirichlet Allocation, a variant of ITM (Hu et al, 2014), where constraints are inferred by applying k-means to external word embeddings. Each resulting word cluster is then regarded as a constraint.…”
Section: Datasets and Models Description (mentioning)
confidence: 99%
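
The citation statement above describes inferring constraints by running k-means over word embeddings and treating each resulting cluster as one constraint. The following is a minimal sketch of that idea only, not the cited authors' implementation: the function name `kmeans_constraints`, the toy two-dimensional embeddings, and the choice of plain Lloyd's k-means with squared Euclidean distance are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_constraints(embeddings, k, iterations=20, seed=0):
    """Cluster words by embedding and return each cluster as a set of words.

    Hypothetical helper: `embeddings` maps word -> vector, and k must not
    exceed the number of words. Each returned set plays the role of one
    constraint (words that should share a topic).
    """
    words = sorted(embeddings)
    rng = random.Random(seed)
    # Initialise centroids from k distinct words, chosen reproducibly.
    centroids = [list(embeddings[w]) for w in rng.sample(words, k)]
    assignment = {}
    for _ in range(iterations):
        # Assignment step: attach each word to its nearest centroid.
        new_assignment = {
            w: min(range(k),
                   key=lambda c: squared_distance(embeddings[w], centroids[c]))
            for w in words
        }
        if new_assignment == assignment:
            break  # converged: no word changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [embeddings[w] for w in words if assignment[w] == c]
            if members:
                centroids[c] = mean_vector(members)
    clusters = [set() for _ in range(k)]
    for word, c in assignment.items():
        clusters[c].add(word)
    return [cluster for cluster in clusters if cluster]


# Toy embeddings (invented values) with two well-separated groups.
toy_embeddings = {
    "dog": [0.9, 0.1], "cat": [1.0, 0.0], "puppy": [0.95, 0.05],
    "stock": [0.0, 1.0], "bond": [0.1, 0.9], "market": [0.05, 0.95],
}
constraints = kmeans_constraints(toy_embeddings, k=2)
for constraint in constraints:
    print(sorted(constraint))
```

In practice the embeddings would come from an external pretrained model and the number of clusters would be a tuning choice; neither is specified in the quoted statement.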
“…Our tool is complementary to this large body of work, and supports real-world deployment of these techniques. Interactive topic modeling (Hu et al, 2014) can play a key role to help users not only verify model consistency but actively curate high-quality codes; its inclusion is beyond the scope of a single conference paper. While supervised learning (Settles, 2011) has been applied to content analysis, it represents the application of a pre-defined coding scheme to a text corpus, which is different from the task of devising a coding scheme and assessing its reliability.…”
Section: Reproducibility Of a Coding Process (mentioning)
confidence: 99%
“…Topic models like LDA rely on parameters that, while there are methods for doing so, cannot easily be estimated through computation alone. Often, some emerging topics will be nonsensical to a human user [9]. Through interactivity, a topic model can be guided towards achieving more meaningful results.…”
Section: Topic Models and Interactivity (mentioning)
confidence: 99%
“…Here, the FOL constraints are similar to the must link and cannot link constraints of [3], but defined on word-pairs rather than documents. In some cases, real-time interactive knowledge injection has been applied, such as in [9], where the authors have used similar concepts as in [12] to create a framework allowing users to iteratively and interactively improve topic modeling results. While the work in [12] and [9] are general-purpose solutions, many of the specialised variations of LDA which incorporate domain knowledge are custom-built, single-purpose methods.…”
Section: A Human Knowledge Injection In Topic Models (mentioning)
confidence: 99%