“…Normally, evaluating an IR system requires test collections containing queries, documents, and relevance judgments; however, building such collections requires a significant amount of work, namely collecting data on queries and judgments. Thus, in many recent studies (Gayo-Avello & Brenes, 2009; Joachims, 2003; Jung, Herlocker, & Webster, 2007; Liu, Fu, Zhang, Ma, & Ru, 2007; Zareh Bidoki et al., 2010), click-through data were employed to evaluate search engines' performance. The concept is simple: employ clicks as relevance judgments, assuming that a user judges a result to be relevant if the user selects it from the search results returned for a query.…”
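
The idea of using clicks as relevance judgments can be illustrated with a minimal sketch. It assumes a hypothetical click log of (query, clicked document) pairs and uses precision at k as the evaluation measure; the log format, the function names, and the choice of measure are assumptions made for illustration and are not taken from the cited studies.

```python
from collections import defaultdict


def clicks_to_judgments(click_log):
    """Treat every clicked (query, document) pair as a relevance judgment."""
    judgments = defaultdict(set)
    for query, doc_id in click_log:
        judgments[query].add(doc_id)  # repeated clicks collapse into one judgment
    return dict(judgments)


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that users clicked."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranking[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant) / k


# Hypothetical click log: one (query, clicked document id) pair per click.
click_log = [
    ("jaguar speed", "d2"),
    ("jaguar speed", "d5"),
    ("jaguar speed", "d2"),
    ("python sort", "d9"),
]

# Hypothetical ranking returned by the search engine for "jaguar speed".
ranking = ["d1", "d2", "d3", "d4", "d5"]

judgments = clicks_to_judgments(click_log)
p_at_5 = precision_at_k(ranking, judgments["jaguar speed"], 5)
print(f"P@5 = {p_at_5:.2f}")
```

Here two of the five ranked documents were clicked, so the sketch reports a precision at 5 of 0.40 under the stated assumption that a click signals relevance.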