When a Web user's underlying information need is not clearly specified from the initial query, an effective approach is to diversify the results retrieved for this query. In this paper, we introduce a novel probabilistic framework for Web search result diversification, which explicitly accounts for the various aspects associated with an underspecified query. In particular, we diversify a document ranking by estimating how well a given document satisfies each uncovered aspect and the extent to which different aspects are satisfied by the ranking as a whole. We thoroughly evaluate our framework in the context of the diversity task of the TREC 2009 Web track. Moreover, we exploit query reformulations provided by three major Web search engines (WSEs) as a means to uncover different query aspects. The results attest to the effectiveness of our framework when compared to state-of-the-art diversification approaches in the literature. Additionally, by simulating an upper-bound query reformulation mechanism from official TREC data, we draw useful insights regarding the effectiveness of the query reformulations generated by the different WSEs in promoting diversity.
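The idea of rewarding documents that satisfy aspects not yet covered by the ranking can be illustrated with a small greedy re-ranker. This is a minimal sketch under stated assumptions, not the formulation from the paper: the function name `diversify`, the mixing parameter `lam`, and the inputs `relevance`, `aspect_weight` and `coverage` are all hypothetical, and the scores are assumed to be probabilities in [0, 1].

```python
def diversify(candidates, relevance, aspect_weight, coverage, k, lam=0.5):
    """Greedily pick k documents, trading off relevance against novelty.

    relevance[d]      -- assumed relevance of document d to the query
    aspect_weight[a]  -- assumed importance of aspect a for the query
    coverage[d][a]    -- assumed degree to which d satisfies aspect a
    lam               -- hypothetical mixing weight (0 = relevance only)
    """
    selected = []
    remaining = list(candidates)
    # Probability that each aspect is still unsatisfied by the selected documents.
    not_covered = {a: 1.0 for a in aspect_weight}

    while remaining and len(selected) < k:
        def score(d):
            # Reward coverage of aspects the current ranking has not satisfied yet.
            novelty = sum(
                aspect_weight[a] * coverage[d].get(a, 0.0) * not_covered[a]
                for a in aspect_weight
            )
            return (1 - lam) * relevance[d] + lam * novelty

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        # Discount every aspect by how well the chosen document satisfies it.
        for a in aspect_weight:
            not_covered[a] *= 1.0 - coverage[best].get(a, 0.0)
    return selected


docs = ["d1", "d2", "d3"]
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
aspect_weight = {"a": 0.5, "b": 0.5}
coverage = {"d1": {"a": 0.9}, "d2": {"a": 0.9}, "d3": {"b": 0.9}}
diversified = diversify(docs, relevance, aspect_weight, coverage, k=3)
```

In this toy input, `d2` is more relevant than `d3` but repeats the aspect already covered by `d1`, so the re-ranker promotes `d3` to the second position.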
The prediction of query performance is an interesting and important issue in Information Retrieval (IR). Current predictors involve the use of relevance scores, which are time-consuming to compute. Therefore, current predictors are not very suitable for practical applications. In this paper, we study a set of predictors of query performance, which can be generated prior to the retrieval process. The linear and non-parametric correlations of the predictors with query performance are thoroughly assessed on the TREC disk4 and disk5 (minus CR) collections. According to the results, some of the proposed predictors have significant correlation with query performance, showing that these predictors can be useful to infer query performance in practical applications.
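To make the notion of a pre-retrieval predictor concrete, the sketch below computes one from collection statistics alone and measures its linear (Pearson) and non-parametric (Spearman) correlation with per-query performance. The abstract does not name the predictors studied, so the average inverse document frequency used here is an illustrative assumption, as are all function names and the toy numbers.

```python
import math


def avg_idf(query_terms, doc_freq, num_docs):
    """Average IDF of the query terms, using only collection statistics."""
    idfs = [
        math.log(num_docs / doc_freq[t])
        for t in query_terms
        if doc_freq.get(t, 0) > 0  # skip terms absent from the collection
    ]
    return sum(idfs) / len(idfs) if idfs else 0.0


def pearson(xs, ys):
    """Linear correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank positions starting at 1; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy statistics, made up for illustration only.
doc_freq = {"rare": 10, "common": 1000}
predictor_values = [1.0, 2.0, 3.0]
average_precision = [0.10, 0.20, 0.15]
rho = spearman(predictor_values, average_precision)
```

A predictor of this kind needs no retrieval scores, which is the property the abstract highlights as making it suitable for practical use.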
Venue recommendation systems aim to effectively rank a list of interesting venues users should visit based on their historical feedback (e.g. checkins). Such systems are increasingly deployed by Location-based Social Networks (LBSNs) such as Foursquare and Yelp to enhance their usefulness to users. Recently, various RNN architectures have been proposed to incorporate contextual information associated with the users' sequence of checkins (e.g. time of the day, location of venues) to effectively capture the users' dynamic preferences. However, these architectures assume that different types of contexts have an identical impact on the users' preferences, which may not hold in practice. For example, an ordinary context, such as the time of the day, reflects the user's current contextual preferences, whereas a transition context, such as a time interval from their last visited venue, indicates a transition effect from past behaviour to future behaviour. To address these challenges, we propose a novel Contextual Attention Recurrent Architecture (CARA) that leverages both sequences of feedback and contextual information associated with the sequences to capture the users' dynamic preferences. Our proposed recurrent architecture consists of two types of gating mechanisms, namely 1) a contextual attention gate that controls the influence of the ordinary context on the users' contextual preferences and 2) a time- and geo-based gate that controls the influence of the hidden state from the previous checkin based on the transition context. Thorough experiments on three large checkin and rating datasets from commercial LBSNs demonstrate the effectiveness of our proposed CARA architecture by significantly outperforming many state-of-the-art RNN architectures and factorisation approaches.
Dynamic pruning strategies permit efficient retrieval by not fully scoring all postings of the documents matching a query, without degrading the retrieval effectiveness of the top-ranked results. However, the amount of pruning achievable for a query can vary, resulting in queries taking different amounts of time to execute. Knowing in advance the execution time of queries would permit the exploitation of online algorithms to schedule queries across replicated servers in order to minimise the average query waiting and completion times. In this work, we investigate the impact of dynamic pruning strategies on query response times, and propose a framework for predicting the efficiency of a query. Within this framework, we analyse the accuracy of several query efficiency predictors across 10,000 queries submitted to in-memory inverted indices of a 50-million-document Web crawl. Our results show that combining multiple efficiency predictors with regression can accurately predict the response time of a query before it is executed. Moreover, using the efficiency predictors to facilitate online scheduling algorithms can result in a 22% reduction in the mean waiting time experienced by queries before execution, and a 7% reduction in the mean completion time experienced by users.
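Why predicted execution times help scheduling can be seen in a toy single-server queue. The sketch below is an assumption-laden illustration, not the scheduling algorithm evaluated in the paper: it supposes that all queries are already waiting, that one server processes them sequentially, and that the hypothetical `predicted` times come from some efficiency predictor. It compares arrival order against a shortest-predicted-first order.

```python
def mean_waiting_time(service_times):
    """Mean time each query waits before its execution starts."""
    total_wait, clock = 0.0, 0.0
    for t in service_times:
        total_wait += clock  # this query waited for everything before it
        clock += t
    return total_wait / len(service_times)


def shortest_predicted_first(queries, predicted):
    """Order queued queries by their predicted response time."""
    return sorted(queries, key=lambda q: predicted[q])


# Made-up response times (arbitrary units) for three queued queries.
actual = {"q1": 5.0, "q2": 1.0, "q3": 2.0}
predicted = {"q1": 4.5, "q2": 1.2, "q3": 2.1}  # imperfect but order-preserving

arrival_order = ["q1", "q2", "q3"]
scheduled_order = shortest_predicted_first(arrival_order, predicted)

fifo_wait = mean_waiting_time([actual[q] for q in arrival_order])
scheduled_wait = mean_waiting_time([actual[q] for q in scheduled_order])
```

The predictions only need to order the queries correctly for the shortest-first schedule to reduce the mean waiting time in this example; how much is gained in practice depends on predictor accuracy and on the arrival pattern.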