Proceedings of the 2015 International Conference on the Theory of Information Retrieval
DOI: 10.1145/2808194.2809471
Axiomatic Analysis of Smoothing Methods in Language Models for Pseudo-Relevance Feedback

Abstract: Pseudo-Relevance Feedback (PRF) is an important general technique for improving retrieval effectiveness without requiring any user effort. Several state-of-the-art PRF models are based on the language modeling approach, where a query language model is learned based on feedback documents. In all these models, feedback documents are represented with unigram language models smoothed with a collection language model. While collection language model-based smoothing has proven both effective and necessary in using la…

Cited by 15 publications (8 citation statements) · References 16 publications
“…Finally, α > 0 is the pseudo-count smoothing parameter. Motivated by a Bayesian interpretation of placing a Jeffreys-type Dirichlet prior over the multinomial counts, we choose α = 0.5 (Hazimeh and Zhai, 2015; Valcarce et al., 2016; Manning et al., 2008). The quadratic loss is given by the following formula:…”
Section: Classifier Assessment
confidence: 99%
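The pseudo-count estimate described in this excerpt can be sketched as follows; the function name and the toy vocabulary are ours for illustration, not from the cited work:

```python
from collections import Counter

def additive_smoothing(counts, vocab, alpha=0.5):
    """Pseudo-count (additive) smoothing of multinomial estimates.

    With alpha = 0.5 this corresponds to the Jeffreys-type Dirichlet
    prior over the multinomial counts mentioned in the excerpt:
    p(w) = (c(w) + alpha) / (N + alpha * |V|).
    """
    total = sum(counts.get(w, 0) for w in vocab)
    denom = total + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / denom for w in vocab}

counts = Counter(["retrieval", "model", "model", "feedback"])
vocab = ["retrieval", "model", "feedback", "query"]
probs = additive_smoothing(counts, vocab, alpha=0.5)
```

Because every word receives the same pseudo-count, unseen words (here `"query"`) get non-zero probability and the estimates still sum to one.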
“…The rapid development of language modeling (LM) provides favorable conditions for the development of effective PRF models (for instance, Ponte & Croft, 1998). A wide range of LM-based retrieval approaches have been proposed (for instance, Lavrenko & Croft, 2001; Lv & Zhai, 2009a; Song & Croft, 1999; Hazimeh & Zhai, 2015; Zhai, 2008), in which feedback documents are exploited to re-estimate a more accurate query language model. For example, Zhai and Lafferty (2001) presented a model-based feedback approach in which two methods for updating a query language model from feedback documents were evaluated: one based on a generative probabilistic model of the feedback documents and the other based on minimizing the KL divergence over the feedback documents.…”
Section: Related Work
confidence: 99%
“…The other traditional class of models we should mention here is the relevance model (RM) framework, a well-known LM-based retrieval framework. It has an intuitive probabilistic interpretation and has been proven effective in several empirical studies (for instance, Hazimeh & Zhai, 2015). Two assumptions are adopted in the RM framework: first, each piece of information related to the topic has an underlying RM, which follows a multinomial distribution over words; second, the terms belonging to the query topic and the terms in the feedback documents are randomly sampled according to a distribution R. RMs can take different forms depending on the estimation approach, and these models do not model the relevant or pseudo-relevant documents in an explicit way.…”
Section: Adaptation of Traditional Models
confidence: 99%
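Under the two assumptions described in this excerpt, the most common estimator in the RM framework (often called RM1) mixes feedback-document language models weighted by their query likelihood: p(w|R) ≈ Σ_D p(w|D) p(D|Q), with p(D|Q) ∝ Π_q p(q|D) under a uniform document prior. A minimal sketch, assuming additively smoothed document models (our illustrative choice, not prescribed by the excerpt):

```python
import math
from collections import Counter

def doc_lm(doc, vocab, alpha=0.5):
    # Unigram document model with additive smoothing (illustrative choice).
    c = Counter(doc)
    denom = len(doc) + alpha * len(vocab)
    return {w: (c[w] + alpha) / denom for w in vocab}

def rm1(query, docs, vocab, alpha=0.5):
    """Sketch of RM1: p(w|R) ~ sum_D p(w|D) p(D|Q), where
    p(D|Q) is proportional to prod_q p(q|D) (uniform doc prior)."""
    models = [doc_lm(d, vocab, alpha) for d in docs]
    qlik = [math.prod(m[q] for q in query) for m in models]
    z = sum(qlik)
    weights = [l / z for l in qlik]
    return {w: sum(wt * m[w] for wt, m in zip(weights, models))
            for w in vocab}

docs = [["smoothing", "language", "model", "model"],
        ["cat", "dog", "dog", "dog"]]
vocab = ["smoothing", "language", "model", "cat", "dog"]
rm = rm1(["model"], docs, vocab)
```

Since the mixture weights sum to one and each document model is a proper distribution, the resulting relevance model is itself a distribution, and words from documents that score well on the query dominate it.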
“…When performing pseudo-relevance feedback in retrieval, an axiomatic analysis of RM1 showed that additive smoothing is a better smoothing method than the others because it does not demote the IDF effect [8]. For collaborative filtering, relevance models work better with absolute discounting than with Dirichlet priors or Jelinek-Mercer smoothing. However, a subsequent axiomatic analysis of RM2 for collaborative filtering showed that the IDF effect is related to item novelty in recommendation, advocating the use of additive smoothing in this setting.…”
Section: Additive Smoothing (A)
confidence: 99%
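For reference, the four smoothing methods contrasted across these excerpts can be written side by side; the parameter names (`lam`, `mu`, `delta`, `alpha`) and their defaults are ours, not taken from the cited papers. Note that additive smoothing is the only one that does not mix in the collection model p(w|C), which is why it does not demote the IDF effect:

```python
from collections import Counter

def smoothed_prob(w, doc, coll_prob, method, vocab_size=None,
                  lam=0.5, mu=1000.0, delta=0.7, alpha=0.5):
    """Illustrative formulas for the smoothing methods discussed above.

    coll_prob is the collection-model probability p(w|C); doc is a
    token list. Parameter values are conventional, not authoritative.
    """
    c = Counter(doc)
    n = len(doc)
    if method == "jelinek-mercer":
        # Linear interpolation with the collection model.
        return (1 - lam) * c[w] / n + lam * coll_prob
    if method == "dirichlet":
        # Dirichlet prior scaled by the collection model.
        return (c[w] + mu * coll_prob) / (n + mu)
    if method == "absolute-discount":
        # Subtract delta from seen counts; redistribute via p(w|C).
        uniq = len(c)
        return max(c[w] - delta, 0) / n + delta * uniq / n * coll_prob
    if method == "additive":
        # Pseudo-counts only; no collection model involved.
        return (c[w] + alpha) / (n + alpha * vocab_size)
    raise ValueError(f"unknown method: {method}")
```

Each variant yields a proper distribution over the vocabulary; they differ in how much probability mass flows toward words that are frequent in the whole collection.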
“…Hazimeh and Zhai formally studied the IDF effect in several state-of-the-art pseudo-relevance feedback techniques based on the language modelling framework (including relevance models) [8]. The IDF effect is a heuristic that emphasizes the selection of documents containing highly specific terms. They found that the choice of smoothing method affects the IDF effect.…”
Section: Introduction
confidence: 99%