Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing - SemiSupLearn '09 2009
DOI: 10.3115/1621829.1621835
Latent Dirichlet Allocation with topic-in-set knowledge

Abstract: Latent Dirichlet Allocation is an unsupervised graphical model which can discover latent topics in unlabeled data. We propose a mechanism for adding partial supervision, called topic-in-set knowledge, to latent topic modeling. This type of supervision can be used to encourage the recovery of topics that are more relevant to a user's modeling goals than the topics that would otherwise be recovered. Preliminary experiments on text datasets demonstrate the potential effectiveness of this method.
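The supervision described in the abstract amounts to restricting the topic assignment of selected tokens to a user-specified set during inference. A minimal sketch of how such a constraint could be enforced in one collapsed Gibbs sampling sweep over a single document (the function and variable names here are illustrative assumptions, not the authors' implementation):

```python
import random

def gibbs_step(doc_words, z, allowed_topics, n_dk, n_kw, n_k,
               n_topics, vocab_size, alpha=0.1, beta=0.01):
    """One collapsed-Gibbs sweep over one document; tokens listed in
    allowed_topics may only be assigned topics from their given set
    (topic-in-set knowledge)."""
    d = 0  # single-document sketch; n_dk is indexed [doc][topic]
    for i, w in enumerate(doc_words):
        k_old = z[i]
        # remove the token's current assignment from the counts
        n_dk[d][k_old] -= 1
        n_kw[k_old][w] -= 1
        n_k[k_old] -= 1
        # candidate topics: the constrained set if given, else all topics
        candidates = list(allowed_topics.get(i, range(n_topics)))
        # standard collapsed-Gibbs conditional, evaluated only on candidates
        weights = [(n_dk[d][k] + alpha)
                   * (n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta)
                   for k in candidates]
        k_new = random.choices(candidates, weights=weights)[0]
        # record the new assignment
        z[i] = k_new
        n_dk[d][k_new] += 1
        n_kw[k_new][w] += 1
        n_k[k_new] += 1
    return z
```

Unconstrained tokens fall back to the usual sampler over all topics, so the constraint only prunes the candidate set rather than changing the underlying model.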

Cited by 126 publications (57 citation statements). References 12 publications.
“…In this way, we can "word seed" a topic to a given set of words. A similar method has been used by Jagarlamudi et al (2012) and Andrzejewski & Zhu (2009), but unlike those previous approaches, we do not change the underlying model to incorporate this prior knowledge. Instead, we put a restriction on which topics the seeding words are allowed to belong to by using conditional Dirichlet distributions, conditioned on the probability being zero in all topics other than the seeded ones.…”
Section: Methods
confidence: 99%
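The restriction described in the statement above — forcing a seed word's probability to zero in every topic other than its seeded ones — can be sketched as a mask over a topic-word probability matrix. This is a hypothetical illustration of the idea, not the cited authors' code; the function name, the renormalization step, and the dictionary layout are all assumptions:

```python
import numpy as np

def apply_seed_mask(phi, seed_topics, vocab_index):
    """Zero out each seed word's probability in all non-seeded topics,
    then renormalize every topic's word distribution."""
    phi = phi.copy()
    n_topics = phi.shape[0]
    for word, topics in seed_topics.items():
        w = vocab_index[word]
        # topics the seed word is NOT allowed to belong to
        blocked = [k for k in range(n_topics) if k not in topics]
        phi[blocked, w] = 0.0
    # renormalize each row so it remains a probability distribution
    phi /= phi.sum(axis=1, keepdims=True)
    return phi
```

After masking, any topic outside a word's seeded set assigns it zero probability, which mirrors conditioning the Dirichlet on those entries being zero.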
“…works, many variations have been proposed [1,2,4,6,9,10,26,27,29,30,32,37,40]. In this paper, we only focus on the variations that add supervised information in the form of latent topic assignments.…”
Section: Introduction
confidence: 99%
“…To the best of our knowledge, this is the first constrained LDA model which can process large scale constraints in the forms of must-links and cannot-links. There are two existing works by Andrzejewski and Zhu [1,2] that are related to the proposed model. However, [1] only considers must-link constraints.…”
Section: Introduction
confidence: 99%