2004
DOI: 10.21236/ada460118

UMass at TREC 2004: Novelty and HARD

Abstract: For the TREC 2004 Novelty track, UMass participated in all four tasks. Although finding relevant sentences was harder this year than last, we continue to show marked improvements over the baseline of calling all sentences relevant, with a variant of tfidf being the most successful approach. We achieve 5-9% improvements over the baseline in locating novel sentences, primarily by looking at the similarity of a sentence to earlier sentences and focusing on named entities. For the High Accuracy Retrieval from Docum…

Cited by 191 publications (224 citation statements)
References 33 publications
“…This interpolated language model can then be used with Equation 4 to rank documents (Abdul-Jaleel et al, 2004). We will refer to this as the expanded query score of a document.…”
Section: Query Expansion With Word Embeddings
confidence: 99%
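The snippet above plugs the interpolated language model into the citing paper's Equation 4 to rank documents; that equation is not reproduced here. A minimal sketch of one common form of this step, scoring a document language model against the expanded query model by cross-entropy (the function name, variable names, and the smoothing floor are illustrative assumptions, not taken from the paper):

```python
import math

def score_document(expanded_query_model, doc_model, floor=1e-10):
    """Cross-entropy ranking score: sum over w of P'(w|q) * log P(w|d).

    A hypothetical stand-in for the ranking equation referenced above;
    `floor` crudely smooths terms absent from the document model.
    """
    return sum(p_q * math.log(doc_model.get(w, floor))
               for w, p_q in expanded_query_model.items())

# Documents whose language models match the expanded query score higher.
expanded_q = {"novelty": 0.5, "retrieval": 0.5}
on_topic = {"novelty": 0.5, "retrieval": 0.5}
off_topic = {"sports": 0.9, "news": 0.1}
```

Under this sketch, `score_document(expanded_q, on_topic)` exceeds `score_document(expanded_q, off_topic)`, so on-topic documents rank first.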
“…In [14] they compared five methods to estimate the query language models: RM3 and RM4 [1]; a divergence minimization model (DMM) and a simple mixture model (SMM) [23]; and a regularized mixture model (RMM) [20]. The main finding of this paper was that, in general, RM3 is the best and most stable method among the others.…”
Section: Introduction
confidence: 93%
“…The main finding of this paper was that, in general, RM3 is the best and most stable method among the others. RM3 and RM4 [1] are extensions of the originally formulated RM1 and RM2 approximations, respectively. These extensions linearly interpolate the original query with the terms selected for expansion using RM1 or RM2.…”
Section: Introduction
confidence: 99%
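The linear interpolation described in the snippet above — RM3-style expansion, which mixes the original query model with a relevance model (RM1) estimated from feedback documents — can be sketched as follows (all names and the example term weights are illustrative, not drawn from [1]):

```python
from collections import Counter

def query_model(query_terms):
    # Maximum-likelihood query model P(w|q) from the raw query terms.
    counts = Counter(query_terms)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def rm3(query_terms, rm1_model, lam=0.5):
    # P'(w|q) = lam * P(w|q) + (1 - lam) * P_RM1(w)
    # `rm1_model` is assumed to be a normalized term distribution
    # estimated from top-ranked feedback documents.
    qm = query_model(query_terms)
    vocab = set(qm) | set(rm1_model)
    return {w: lam * qm.get(w, 0.0) + (1 - lam) * rm1_model.get(w, 0.0)
            for w in vocab}

expanded = rm3(["novelty", "retrieval"],
               {"trec": 0.4, "novelty": 0.6}, lam=0.6)
```

Because both input distributions are normalized, the interpolated model also sums to one; `lam` controls how strongly the expansion terms can drift from the original query.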
“…F-score: Blott et al (2004) / Gamon (2006) 0.622; Tomiyama et al (2004) 0.619; Abdul-Jaleel et al (2004) 0.618; Schiffman and McKeown (2004) 0 …”
Section: Systems
confidence: 99%