Most information retrieval (IR) systems consider the relevance, usefulness, and quality of information objects (documents, queries) for evaluation, prediction, and recommendation, often ignoring the underlying search process of information seeking. This may leave out opportunities to make recommendations that analyze the search process and/or recommend alternative search processes instead of objects. To overcome this limitation, we investigated whether, by analyzing a searcher's current processes, we could forecast their likelihood of achieving a certain level of success with respect to search performance in the future. We propose a machine-learning-based method to dynamically evaluate and predict search performance several time-steps ahead at each given time point of the search process during an exploratory search task. Our prediction method uses a collection of features extracted from the expression of information need and the coverage of information. For testing, we used log data collected from four user studies that included 216 users (96 individuals and 60 pairs). Our results show 80-90% accuracy in prediction, depending on the number of time-steps ahead. In effect, the work reported here provides a framework for evaluating search processes during exploratory search tasks and predicting search performance. Importantly, the proposed approach is based on user processes and is independent of any IR system.
Introduction

Predicting how people perform in their information search processes is a hard problem. The prediction problem becomes even more complex when considering exploratory searches. Exploratory search is typically described as open-ended and multifaceted, where the goals may be unclear and there may be no or multiple satisfactory answers (Marchionini, 2006; Wildemuth & Freund, 2012). Searchers engage in exploratory search when they commence researching a new topic, when they form a new idea, in problem identification, and in other forms of information seeking for creative discovery. The searcher shifts out of an exploratory search when they have defined the issues and problems in the new topic area and are now sure of their information need. MacKay (1969) and Taylor (1968) explain this difference in terms of issuing a command to the information system and asking a question of the information system. In a command, the user wishes to affect the "goal-setting" levers in the information system, while in a question the user wishes the system to affect the goal-settings, the range of state of readiness, in the user (MacKay, 1969, p. 101; see also Cole, 2012, p. 20).

Although modern information retrieval (IR) systems aim to support exploratory search, such systems are often unable to perform dynamic and timely predictions of their users' search performance. Exploratory search, as a combination of browsing and focused search, involves a variety of key factors such as context, intentions, motivations, prior knowledge, feelings, expectations, and strategies, which are often ignored by these IR systems. In the absence of such informati...