Modern Information Retrieval (IR) systems have become increasingly complex, involving a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.), or various query expansion parameters, whose values greatly influence the overall retrieval effectiveness. Traditionally, these parameters are set at the system level based on training queries, and the same parameters are then used for different queries. We observe that it may not be easy to set all these parameters separately, since they can be dependent. In addition, a global setting for all queries may not best fit individual queries with different characteristics. The parameters should be set according to these characteristics. In this article, we propose a novel approach to tackle this problem by dealing with the entire system configuration (i.e., a set of parameters representing an IR system's behaviour) instead of selecting a single parameter at a time. The selection of the best configuration is cast as a problem of ranking the different possible configurations given a query. We apply learning-to-rank approaches for this task. We exploit both the query features and the system configuration features in the learning-to-rank method, so that the selection of the configuration is query dependent. The experiments we conducted on four TREC ad hoc collections show that this approach can significantly outperform the traditional method of tuning the system configuration globally (i.e., grid search) and leads to higher effectiveness than the top-performing systems of the TREC tracks. We also perform an ablation analysis of the impact of different features on the model's learning capability and show that query expansion features are among the most important for adaptive systems.

The study presented in this article is built on the results and conclusions of the previous descriptive analysis studies, but moves a step further by performing a predictive analysis: we investigate how system parameters can be set to fit a given query, i.e., a query-dependent setting of system parameters. We assume that some parameters of the system can be set on the fly at querying time, and that the retrieval system allows us to set different values for these parameters easily. This is indeed the case for most IR systems nowadays. Retrieval platforms such as Terrier [61], Lemur [70], or Lucene [53] allow us to set parameters for the retrieval step. For example, one may choose between several retrieval models (e.g., BM25, language models), different query expansion schemes, and so on. We target this group of parameters that can be set at query time. In contrast, we assume that the IR system has already built an index that cannot be changed easily. For example, it would be difficult to choose between different stemmers at query time, unless we constructed several indexes using different stemmers. We exclude such parameters that cannot be set at query time from this study.

The problem we tackle in this article is query-dependent param...
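To make the ranking formulation above concrete, the following is a minimal sketch of query-dependent configuration selection treated as a (pointwise) learning-to-rank problem. It is not the authors' implementation: the synthetic data, the feature names, and the use of scikit-learn's GradientBoostingRegressor as the scoring model are illustrative assumptions. The point is simply that each (query, configuration) pair is scored from concatenated query and configuration features, and the top-ranked configuration is selected per query.

```python
# A minimal sketch (illustrative assumptions, not the article's implementation):
# query-dependent configuration selection cast as a ranking problem.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_queries, n_configs, d_q, d_c = 50, 20, 8, 6

query_feats = rng.normal(size=(n_queries, d_q))     # e.g., query length, IDF statistics
config_feats = rng.normal(size=(n_configs, d_c))    # e.g., retrieval model, QE parameters
effectiveness = rng.random((n_queries, n_configs))  # e.g., AP of config c on query q (labels)

def build_instances(Q, C, E=None):
    """Concatenate query and configuration features into joint training instances."""
    X = np.array([np.concatenate([q, c]) for q in Q for c in C])
    y = None if E is None else E.reshape(-1)   # row-major flatten matches the (q, c) order
    return X, y

# Pointwise learning-to-rank: learn to predict the effectiveness of a configuration for a query.
X_train, y_train = build_instances(query_feats, config_feats, effectiveness)
ranker = GradientBoostingRegressor().fit(X_train, y_train)

def select_configuration(q, C, model):
    """Rank all candidate configurations for query q and return the index of the best one."""
    X_q = np.array([np.concatenate([q, c]) for c in C])
    return int(np.argmax(model.predict(X_q)))

best = select_configuration(rng.normal(size=d_q), config_feats, ranker)
print("selected configuration:", best)
```

In the article's setting, the label for each (query, configuration) pair would come from the measured effectiveness (e.g., average precision) of running that configuration on the training queries, and a pairwise or listwise learning-to-rank model could replace the pointwise regressor used in this sketch.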