Improving web search ranking by incorporating user behavior information

Agichtein, Eugene; Brill, Eric; Dumais, Susan T.

doi:10.1145/1148170.1148177

Cited by 828 publications

(460 citation statements)

References 16 publications

Supporting

Mentioning

455

Contrasting

Unclassified

Order By: Relevance

“…The underlying assumption was that a result with a larger amount of clicks is more relevant to the query than a result with fewer clicks. Agichtein et al (2006) proposed an idea of aggregating information from many unreliable user search sessions, instead of treating each user as a reliable expert to predict user relevance assessment of search results. Dou et al (2008) used aggregate click-through logs to learn the ranking of search results, and found that the aggregation of a large number of user clicks is indicative of relevance preferences.…”

Section: "Wisdom Of Crowds" Techniques and Information Retrievalmentioning

confidence: 99%

Testing the stability of “wisdom of crowds” judgments of search results over time and their similarity with the search engine rankings

Zhitomirsky‐Geffet

Bar‐Ilan

Levene

2016

Aslib Journal of Information Management

View full text Add to dashboard Cite

Purpose: One of the under-explored aspects in the process of user information seeking behaviour is influence of time on relevance evaluation. It has been shown in previous studies that individual users might change their assessment of search results over time. It is also known that aggregated judgments of multiple individual users can lead to correct and reliable decisions; this phenomenon is known as the "wisdom of crowds". The aim of this study is to examine whether aggregated judgments will be more stable and thus more reliable over time than individual user judgments.Design/Methods: In this study two simple measures are proposed to calculate the aggregated judgments of search results and compare their reliability and stability to individual user judgments. In addition, the aggregated "wisdom of crowds" judgments were used as a means to compare the differences between human assessments of search results and search engine's rankings. A large-scale user study was conducted with 87 participants who evaluated two different queries and four diverse result sets twice, with an interval of two months. Two types of judgments were considered in this study: 1) relevance on a 4-point scale, and 2) ranking on a 10-point scale without ties.Findings: It was found that aggregated judgments are much more stable than individual user judgments, yet they are quite different from search engine rankings. Practical implications:The proposed "wisdom of crowds" based approach provides a reliable reference point for the evaluation of search engines. This is also important for exploring the need of personalization and adapting search engine's ranking over time to changes in users preferences. Originality/Value: This is a first study that applies the notion of "wisdom of crowds" to examine the under-explored phenomenon in the literature of "change in time" in user evaluation of relevance.Keywords: ranking, relevance judgment, change in time, wisdom of crowds Research paper IntroductionNumerous general models of information seeking and web searching behaviour have been proposed in the past such as (Ellis, 1989;Bates, 1989; Kuhlthau, 1991; Dervin, 1992; Johnson and Meishke, 1993;Marchionini, 1995; Spink, 1997; Wilson, 1999; Fisher et al., 2005;Knight and Spink, 2008; Du and Spink, 2010;Case, 2012). Relevance is a central notion in information science and is an important part of user information seeking models (Saracevic, 2007). This study investigates an under-explored topic in the literature (Saracevic, 2007): stability and change of user assessment of search results over time. Human evaluation of documents relevance is a complex process that requires coordination of multiple cognitive tasks (Du and Spink, 2011). User result evaluation is needed in many fields and has many purposes, hence it is important to understand the factors and phenomena behind it. This study aims to extend the understanding of the result evaluation component of the proposed web search behavior models, with respect to the temporal change factor. In this b...

show abstract

Section: "Wisdom Of Crowds" Techniques and Information Retrievalmentioning

confidence: 99%

Testing the stability of “wisdom of crowds” judgments of search results over time and their similarity with the search engine rankings

Zhitomirsky‐Geffet

Bar‐Ilan

Levene

2016

Aslib Journal of Information Management

View full text Add to dashboard Cite

show abstract

“…In (Agichtein, Brill, & Dumais, 2006) by using many click-through data features as user feedback in the ranking process both directly and indirectly, they found very interesting results. They used 3,000 user queries for evaluation and found a 31% increase in ranking quality in comparison to other ranking algorithms.…”

Section: Background and Related Workmentioning

confidence: 99%

A3CRank: An adaptive ranking method based on connectivity, content and click-through data

Bidoki¹,

Ghodsnia²,

Yazdani³

et al. 2010

Information Processing & Management

View full text Add to dashboard Cite

Due to the proliferation and abundance of information on the web, ranking algorithms play an important role in web search. Currently, there are some ranking algorithms based on content and connectivity such as PageRank and BM25. Unfortunately, these algorithms have low precision and are not always satisfying for users. In this paper, we propose an adaptive method based on the content, connectivity and click-through data triple, called A3CRank. The aggregation idea of meta search engines has been used to aggregate ranking algorithms such as PageRank, BM25, TF-IDF. We have used reinforcement learning to incorporate user behavior and find a measure of user satisfaction for each ranking algorithm. Furthermore, OWA, an aggregation operator is used for merging the results of the various ranking algorithms. A3CRank adapts itself with user needs and makes use of user clicks to aggregate the results of ranking algorithms. A3Crank is designed to overcome some of the shortcomings of existing ranking algorithms by combining them together and producing an overall better ranking criterion. Experimental results indicate that A3CRank outperforms all other single ranking algorithms in P@n and NDCG measures. We have used 130 queries on University of California at Berkeley's web to train and evaluate our method.

show abstract

“…Recent studies show that users tend to click on the documents in the first ranks of the ranking list. The study results of Agichtein et al on frequency distribution of relevance of users' clicks on a web search results have shown that the relative number of clicks on documents decreases with lower rank [3], and users click on documents in the second, third and fourth rank, respectively, with probability about 60%, 50% and 30% [4]. This shows that users click on high rank documents in the results, even if the documents are irrelevant.…”

Section: Introductionmentioning

confidence: 96%

Web pages ranking algorithm based on reinforcement learning and user feedback

Derhami

Paksima

Khajeh

2015

JAIDM

View full text Add to dashboard Cite

The main challenge of a search engine is ranking web documents to provide the best response to a user`s query. Despite the huge number of the extracted results for user`s query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement learning and user`s feedback called RL3F are considered. In the proposed algorithm, the ranking system has been considered to be the agent of learning system and selecting documents to display to the user is as the agents' action. The reinforcement signal in the system is calculated according to a user`s clicks on documents. Action-value values of the proposed algorithm are computed for each feature. In each learning cycle, the documents are sorted out for the next query, and according to the document in the ranked list, documents are selected at random to show the user. Learning process continues until the training is completed. LETOR3 benchmark is used to evaluate the proposed method. Evaluation results indicated that the proposed method is more effective than other methods mentioned for comparison in this paper. The superiority of the proposed algorithm is using several features of document and user`s feedback simultaneously.

show abstract

Improving web search ranking by incorporating user behavior information

Cited by 828 publications

References 16 publications

Testing the stability of “wisdom of crowds” judgments of search results over time and their similarity with the search engine rankings

Testing the stability of “wisdom of crowds” judgments of search results over time and their similarity with the search engine rankings

A3CRank: An adaptive ranking method based on connectivity, content and click-through data

Web pages ranking algorithm based on reinforcement learning and user feedback

Contact Info

Product

Resources

About