2014
DOI: 10.3166/dn.17.1.61-84

Accurate and effective latent concept modeling for ad hoc information retrieval

Abstract: A keyword query is the representation of the information need of a user, and is the result of a complex cognitive process which often results in under-specification. We propose an unsupervised method, namely Latent Concept Modeling (LCM), for mining and modeling latent search concepts in order to recreate the conceptual view of the original information need. We use Latent Dirichlet Allocation (LDA) to exhibit highly-specific query-related topics from pseudo-relevant feedback documents. We define these …

Cited by 419 publications (284 citation statements)
References 39 publications
“…Similarly to Arun et al (2010), Deveaud et al (2014) The last information we compiled regarding the number of topics is the perplexity, a common strategy to evaluate an LDA fitted model. The perplexity is a metric resulting from the comparison of probability models that assess how well a probability distribution predicts a sample.…”
Section: Selecting the Number Of Topics: Cross-validation Analyses
type: mentioning
confidence: 99%
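To make the perplexity definition in the statement above concrete, here is a minimal Python sketch. It assumes the fitted model's probability for each observed token in a held-out sample is already available; the function name `perplexity` and the input layout are illustrative and are not taken from any of the cited papers.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sample, given the model probability of each observed token.

    Computed as exp(-mean log-likelihood). Lower values mean the model
    predicts the sample better.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("each probability must lie in (0, 1]")
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))
```

A model that assigns probability 0.25 to every observed token is as uncertain as a uniform choice among four outcomes, so its perplexity is about 4. When comparing LDA models with different numbers of topics, the usual practice is to compute this on held-out documents and prefer lower values.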
“…We tested three statistical methods to find the best number of topics: (1) Arun2010 [26], (2) Cao2009 [27], and (3) Deveaud2014 [28]. However, these methods did not converge on our Twitter corpus.…”
Section: Step 3: Topic Modeling
type: mentioning
confidence: 99%
“…The number of topics in our LDA model was selected using the optimization method proposed by Deveaud, SanJuan, and Bellot (2014).…”
Section: Topic Modelling
type: mentioning
confidence: 99%
“…Using the R package "ldatuning" (Murzintcev, 2014), we created 50 different LDA models by varying the K-parameter from 1 to 50. The number of topics in our LDA model was selected using the optimization method proposed by Deveaud, SanJuan, and Bellot (2014). The final LDA "best" model was fitted using the R package "topicmodels" (Hornik & Grün, 2011).…”
Section: Topic Modelling
type: mentioning
confidence: 99%
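The selection procedure described in the statement above (fit one LDA model per candidate K, then keep the K that scores best) can be sketched in Python. This is an illustration under stated assumptions, not the authors' or ldatuning's implementation: it assumes the Deveaud2014 criterion is the average pairwise divergence between topic-word distributions, uses a symmetrised Kullback-Leibler divergence for that purpose, and expects strictly positive probabilities. The names `symmetric_kl`, `mean_pairwise_divergence`, and `select_num_topics` are hypothetical.

```python
import math
from itertools import combinations


def symmetric_kl(p, q):
    """Symmetrised KL divergence between two word distributions.

    Assumes p and q have the same length and strictly positive entries.
    """
    return 0.5 * sum(
        pi * math.log(pi / qi) + qi * math.log(qi / pi)
        for pi, qi in zip(p, q)
    )


def mean_pairwise_divergence(topics):
    """Average divergence over all unordered pairs of topics."""
    pairs = list(combinations(topics, 2))
    if not pairs:
        return 0.0  # a single topic has nothing to diverge from
    return sum(symmetric_kl(p, q) for p, q in pairs) / len(pairs)


def select_num_topics(models):
    """Pick the K whose topics are, on average, the most distinct.

    `models` maps each candidate K to the list of topic-word
    distributions of an LDA model already fitted with that K.
    """
    return max(models, key=lambda k: mean_pairwise_divergence(models[k]))
```

In a workflow like the one quoted, `models` would hold the topic-word distributions of the 50 fitted models, keyed by K from 1 to 50. Fitting the models themselves is left to whichever LDA library is in use.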