2013
DOI: 10.1007/978-3-642-40991-2_36

Nested Hierarchical Dirichlet Process for Nonparametric Entity-Topic Analysis

Abstract: The Hierarchical Dirichlet Process (HDP) is a Bayesian nonparametric prior for grouped data, such as collections of documents, where each group is a mixture of a set of shared mixture densities, or topics, and the number of topics is not fixed but grows with data size. The Nested Dirichlet Process (NDP) builds on the HDP to cluster the documents, but allows them to choose only from a set of specific topic mixtures. In many applications, such a set of topic mixtures may be identified with the set…
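
As a rough illustration of the HDP prior the abstract describes (not the paper's nested model or its inference algorithm), the sketch below draws corpus-level topic weights by truncated stick-breaking and then draws each document's topic proportions around those shared weights. The function names, the truncation level, and the concentration values `gamma` and `alpha` are assumptions chosen for the example, not taken from the paper.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, concentration)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last atom absorbs the leftover mass
    return weights


def dirichlet(shapes, rng):
    """Dirichlet draw via normalised gamma variates (shapes floored above 0)."""
    draws = [rng.gammavariate(max(s, 1e-12), 1.0) for s in shapes]
    total = sum(draws)
    return [d / total for d in draws]


def sample_hdp_topic_weights(num_docs, gamma=1.0, alpha=5.0, truncation=20, seed=0):
    """Return (shared topic weights, per-document topic proportions).

    Assumed finite approximation: with K = truncation shared topics,
    beta ~ truncated GEM(gamma) and pi_d ~ Dirichlet(alpha * beta).
    """
    rng = random.Random(seed)
    beta = stick_breaking(gamma, truncation, rng)
    docs = [dirichlet([alpha * b for b in beta], rng) for _ in range(num_docs)]
    return beta, docs


if __name__ == "__main__":
    beta, docs = sample_hdp_topic_weights(num_docs=3)
    print("shared topic weights:", [round(b, 3) for b in beta[:5]])
    for d, pi in enumerate(docs):
        print(f"doc {d} proportions:", [round(p, 3) for p in pi[:5]])
```

Every document reuses the same set of topics but weights them differently, which is the sharing the abstract refers to. In this sketch the truncation level only caps the number of topics for convenience; the untruncated prior has no such cap.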

Citations: cited by 7 publications (4 citation statements)
References: 11 publications
“…Importantly, the HHDP overcomes the above-mentioned theoretical, modeling, and computational limitations since it, respectively, does not suffer from the degeneracy flaw, is able to effectively capture different weights of shared clusters and allows to handle several populations as showcased in the real data application. Note that the idea of the model was first hinted at in James (2008) and, later, considered in Agrawal et al (2013) from a mere computational point of view without providing results on distributional properties that are relevant for Bayesian inference. Hence, as a by-product, our theoretical results shed also some light on the topic modeling applications of Agrawal et al (2013).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Note that the idea of the model was first hinted at in James (2008) and, later, considered in Agrawal et al (2013) from a mere computational point of view without providing results on distributional properties that are relevant for Bayesian inference. Hence, as a by-product, our theoretical results shed also some light on the topic modeling applications of Agrawal et al (2013). Additionally, the same model was independently applied in Balocchi et al (2021) to successfully cluster urban areal units at different levels of resolution simultaneously.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The models developed by Denti et al (2021) and Beraha et al (2021) similarly address the flexibility constraint in the nDP and combine the nDP with models suitable for grouped data, but different from the HDP. Agrawal et al (2013), Wulsin et al (2016) and Nguyen et al (2014) use a model that is essentially equal to the one used in this work, but apply it in different contexts; moreover Nguyen et al (2014) consider the particular case where additional group-level data is available. Additionally, the same model was independently studied in Lijoi et al (2020), although their focus on the theoretical aspects of the model.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This is related to the dual-HDP described in Wang et al (2009) and the single-entity model of Agrawal et al (2013), for example, although these works tended to focus on textual topic modelling.…”
Citation type: mentioning
confidence: 99%