Ronald T. Fernández scite author profile

Employing effective methods of sentence retrieval is essential for many tasks in Information Retrieval, such as summarization, novelty detection and question answering. The best performing sentence retrieval techniques attempt to perform matching directly between the sentences and the query. However, in this paper, we posit that the local context of a sentence can provide crucial additional evidence to further improve sentence retrieval. Using a Language Modeling Framework, we propose a novel reformulation of the sentence retrieval problem that extends previous approaches so that the local context is seamlessly incorporated within the retrieval models. In a series of comprehensive experiments, we show that localized smoothing and the prior importance of a sentence can improve retrieval effectiveness. The proposed models significantly and substantially outperform the state of the art and other competitive sentence retrieval baselines on recall-oriented measures, while remaining competitive on precision-oriented measures. This research demonstrates that local context plays an important role in estimating the relevance of a sentence, and that existing sentence retrieval language models can be extended to utilize this evidence effectively.
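As a rough illustration of the idea in this abstract, the sketch below scores a sentence by query likelihood, mixing a sentence model with a model of its neighbouring sentences and a collection model. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the function name `score_with_context`, the weights `lam_s` and `lam_c`, the `window` parameter, and the three-way linear mixture itself.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also normalise and stem.
    return text.lower().split()


def score_with_context(query, sentences, index, collection_counts,
                       collection_len, lam_s=0.5, lam_c=0.3, window=1):
    """Log query likelihood of sentences[index] under a three-way mixture.

    Hypothetical formulation: sentence model + local-context model
    (the sentence and its neighbours within `window`) + collection model.
    """
    sent = Counter(tokenize(sentences[index]))
    lo = max(0, index - window)
    hi = min(len(sentences), index + window + 1)
    ctx = Counter()
    for i in range(lo, hi):
        ctx.update(tokenize(sentences[i]))

    sent_len = sum(sent.values()) or 1
    ctx_len = sum(ctx.values()) or 1
    lam_b = 1.0 - lam_s - lam_c  # remaining weight goes to the collection
    vocab = len(collection_counts) + 1

    score = 0.0
    for term in tokenize(query):
        p_s = sent[term] / sent_len
        p_c = ctx[term] / ctx_len
        # Add-one smoothing keeps the collection probability above zero.
        p_b = (collection_counts.get(term, 0) + 1) / (collection_len + vocab)
        score += math.log(lam_s * p_s + lam_c * p_c + lam_b * p_b)
    return score
```

With this mixture, a sentence that does not contain a query term but sits next to one that does still scores above a sentence with no nearby match, which is the kind of evidence the abstract attributes to local context.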

Highly Frequent Terms and Sentence Retrieval

Fernández

In this paper we propose a novel sentence retrieval method based on extracting highly frequent terms from top retrieved documents. We compare it against state-of-the-art sentence retrieval techniques, including those based on pseudo-relevance feedback, showing that the approach is robust and competitive. Our results reinforce the idea that top retrieved data is a valuable source to enhance retrieval systems. This is especially true for short queries because there are usually few query-sentence matching terms. Moreover, the approach is particularly promising for weak queries. We demonstrate that this novel method is able to significantly improve precision at top ranks when handling poorly specified information needs.
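A minimal sketch of the general idea, assuming a simple setup: collect the most frequent non-stopword terms from the top retrieved documents and let sentences earn credit for matching them as well as the query terms. The stopword list, the function names `frequent_terms` and `score_with_frequent_terms`, and the weight `hf_weight` are all hypothetical; the abstract does not specify how the paper selects or weights terms.

```python
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for"}


def frequent_terms(top_docs, n=5):
    """Return the n most frequent non-stopword terms in the top documents."""
    counts = Counter()
    for doc in top_docs:
        counts.update(t for t in doc.lower().split() if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(n)]


def score_with_frequent_terms(sentence, query, hf_terms, hf_weight=0.5):
    """Count query-term matches plus down-weighted frequent-term matches."""
    tokens = set(sentence.lower().split())
    query_terms = set(query.lower().split()) - STOPWORDS
    query_matches = len(tokens & query_terms)
    # Frequent terms already in the query are not counted twice.
    hf_matches = len(tokens & (set(hf_terms) - query_terms))
    return query_matches + hf_weight * hf_matches
```

For a short query, a sentence sharing no query term can still receive a non-zero score through the frequent terms, which is how this kind of approach compensates for the scarcity of query-sentence matches.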

Using opinion-based features to boost sentence retrieval

Fernández

2009

Opinion mining has recently become a major research topic. A wide range of techniques have been proposed to enable opinion-oriented information seeking systems. However, little is known about the ability of opinion-related information to improve regular retrieval tasks. Our hypothesis is that standard retrieval methods might benefit from the inclusion of opinion-based features. A sentence retrieval scenario is a natural choice to evaluate this claim. We propose here a formal method to incorporate some opinion-based features of the sentences as query-independent evidence. We show that this incorporation leads to retrieval methods whose performance is significantly better than the performance of state-of-the-art sentence retrieval models.
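One common way to use query-independent evidence is as a prior that is added, in log space, to a query-dependent score. The sketch below shows that pattern only; the lexicon `OPINION_WORDS`, the smoothing constant `alpha`, and both function names are hypothetical, and the abstract does not say which opinion features or which combination the paper actually uses.

```python
import math

# Hypothetical opinion lexicon, for illustration only.
OPINION_WORDS = {"good", "bad", "great", "terrible", "love", "hate"}


def opinion_prior(sentence, alpha=1.0):
    """Smoothed fraction of opinion words; an uncalibrated stand-in prior."""
    tokens = sentence.lower().split()
    hits = sum(t in OPINION_WORDS for t in tokens)
    return (hits + alpha) / (len(tokens) + 2 * alpha)


def combined_score(query_log_likelihood, sentence):
    """Add the log prior to a query-dependent log score computed elsewhere."""
    return query_log_likelihood + math.log(opinion_prior(sentence))
```

Because the prior depends only on the sentence, it can be computed once at indexing time and reused for every query.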

Seeding simulated queries with user-study data for personal search evaluation

Elsweiler

Toucedo

et al. 2011

In this paper we perform a lab-based user study (n=21) of email re-finding behaviour, examining how the characteristics of submitted queries change in different situations. A number of logistic regression models are developed on the query data to explore the relationship between user- and contextual-variables and query characteristics including length, field submitted to and use of named entities. We reveal several interesting trends and use the findings to seed a simulated evaluation of various retrieval models. Not only is this an enhancement of existing evaluation methods for Personal Search, but the results show that different models are more effective in different situations, which has implications both for the design of email search tools and for the way algorithms for Personal Search are evaluated.

Novelty detection using local context analysis

Fernández

2007