Jonathan L. Elsas scite author profile

Blog feed search poses different and interesting challenges from traditional ad hoc document retrieval. The units of retrieval, the blogs, are collections of documents, the blog posts. In this work we adapt a state-of-the-art federated search model to the feed retrieval task, showing a significant improvement over algorithms based on the best performing submissions in the TREC 2007 Blog Distillation task [12]. We also show that typical query expansion techniques such as pseudo-relevance feedback using the blog corpus do not provide any significant performance improvement and in many cases dramatically hurt performance. We perform an in-depth analysis of the behavior of pseudorelevance feedback for this task and develop a novel query expansion technique using the link structure in Wikipedia. This query expansion technique provides significant and consistent performance improvements for this task, yielding a 22% and 14% improvement in MAP over the unexpanded query for our baseline and federated algorithms respectively.

show abstract

Leveraging temporal dynamics of document content in relevance ranking

Elsas

Dumais

2010

View full text Add to dashboard Cite

Many web documents are dynamic, with content changing in varying amounts at varying frequencies. However, current document search algorithms have a static view of the document content, with only a single version of the document in the index at any point in time. In this paper, we present the first published analysis of using the temporal dynamics of document content to improve relevance ranking. We show that there is a strong relationship between the amount and frequency of content change and relevance. We develop a novel probabilistic document ranking algorithm that allows differential weighting of terms based on their temporal characteristics. By leveraging such content dynamics we show significant performance improvements for navigational queries.

show abstract

Fast learning of document ranking functions with the committee perceptron

Elsas

Carvalho

Carbonell

2008

View full text Add to dashboard Cite

This paper presents a new variant of the perceptron algorithm using selective committee averaging (or voting). We apply this agorithm to the problem of learning ranking functions for document retrieval, known as the "Learning to Rank" problem. Most previous algorithms proposed to address this problem focus on minimizing the number of misranked document pairs in the training set. The committee perceptron algorithm improves upon existing solutions by biasing the final solution towards maximizing an arbitrary rank-based performance metrics. This method performs comparably or better than two state-of-the-art rank learning algorithms, and also provides significant training time improvements over those methods, showing over a 45-fold reduction in training time compared to ranking SVM.

show abstract

Rank learning for factoid question answering with linguistic and semantic constraints

Bilotti

Elsas

Carbonell

et al. 2010

View full text Add to dashboard Cite

This work presents a general rank-learning framework for passage ranking within Question Answering (QA) systems using linguistic and semantic features. The framework enables query-time checking of complex linguistic and semantic constraints over keywords. Constraints are composed of a mixture of keyword and named entity features, as well as features derived from semantic role labeling. The framework supports the checking of constraints of arbitrary length relating any number of keywords. We show that a trained ranking model using this rich feature set achieves greater than a 20% improvement in Mean Average Precision over baseline keyword retrieval models. We also show that constraints based on semantic role labeling features are particularly effective for passage retrieval; when they can be leveraged, an 40% improvement in MAP over the baseline can be realized.

show abstract

Accessing government statistical information

et al. 2005

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Jonathan L. Elsas

Retrieval and feedback models for blog feed search

Leveraging temporal dynamics of document content in relevance ranking

Fast learning of document ranking functions with the committee perceptron

Rank learning for factoid question answering with linguistic and semantic constraints

Accessing government statistical information

Contact Info

Product

Resources

About