Modeling multidimensional relevance in information retrieval (IR) has attracted much attention in recent years. However, most existing studies are conducted through relatively small-scale user studies, which may not reflect real-world, natural search scenarios. In this article, we propose to study the multidimensional user relevance model (MURM) on large-scale query logs, which record users' various search behaviors (e.g., query reformulations, clicks, and dwell time) in natural search settings. We advance an existing MURM (comprising five dimensions: topicality, novelty, reliability, understandability, and scope) by adding two dimensions, namely interest and habit. The two new dimensions represent personalized relevance judgments on retrieved documents. Further, for each dimension in the enriched MURM, a set of computable features is formulated. By conducting extensive document ranking experiments on Bing's query logs and TREC Session Track data, we systematically investigate the impact of each dimension on retrieval performance and obtain a series of findings that may benefit the design of future IR systems.
Introduction

There have been numerous attempts (Tombros, Ruthven, & Jose, 2005; Xu & Yin, 2008; Xu & Chen, 2006; Zhang, Zhang, Lease, & Gwizdka, 2014) to understand users' search behaviors when retrieving information with search engines, for example, relevance judgment and satisfaction or dissatisfaction with search results. Understanding how users judge relevance and what factors influence their satisfaction with the search results would help researchers design more effective retrieval models and better evaluation methodologies, with the aim of further improving users' search experience.

Judging the relevance (or utility) of a retrieved document with respect to a user-issued query (representing the user's current information need) is a central task for search engines. A large number of studies (Barry, 1998; Tombros et al., 2005; Xu & Yin, 2008; Xu & Chen, 2006; Zhang et al., 2014) have revealed that a range of complex factors (e.g., topicality, novelty, reliability, understandability, and scope) affects users' perception of the relevance of retrieved documents. However, the existing work is mainly based on small-scale user studies, which may not reflect users' natural search scenarios, and the relevance judgments involved were made in a static way that cannot capture the dynamics of a user's information need, search interest, and habit.

To address these limitations, in this article we propose to study and understand multidimensional relevance by analyzing real query logs that record real-world user interactions with the search engine (e.g., query reformulations, clicks, and dwell time), as an important supplement to the many existing studies based on user studies. Specifically, we analyze how different factors affect the users' perception of relevance on retrieved documents within the framework of the Multidi...