Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval 2013
DOI: 10.1145/2484028.2484063

Pseudo test collections for training and tuning microblog rankers

Abstract: Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags, and all associated tweets as relevance judgments;…
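The hashtag intuition above translates directly into a pseudo test collection generator. The following is a minimal Python sketch of the baseline described in the abstract; the input layout (tweet dicts with 'id' and 'hashtags' fields), the frequency threshold, and the hashtag-to-query rule are illustrative assumptions, not the paper's exact settings.

from collections import defaultdict

def build_pseudo_test_collection(tweets, min_hashtag_count=50):
    # Sketch of the hashtag-based baseline: commonly used hashtags become
    # topics, and every tweet carrying the hashtag is judged relevant to
    # the query derived from it. Input layout and threshold are assumptions.
    tweets_by_tag = defaultdict(list)
    for tweet in tweets:
        for tag in tweet['hashtags']:
            tweets_by_tag[tag.lower()].append(tweet['id'])

    queries, qrels = {}, {}
    for tag, tweet_ids in tweets_by_tag.items():
        if len(tweet_ids) < min_hashtag_count:
            continue                      # keep only commonly used hashtags
        queries[tag] = tag.lstrip('#')    # naive query derived from the hashtag
        qrels[tag] = set(tweet_ids)       # associated tweets act as relevance judgments
    return queries, qrels

In practice one would likely strip the hashtag itself from the judged tweets before indexing, so that the derived query is not rewarded for trivially matching its own source tag; that refinement is omitted here.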

Cited by 22 publications (29 citation statements); the paper itself cites 36 references. Citing publications appeared between 2014 and 2023. Selected citation statements follow, ordered by relevance.

“…Or should we frame this more positively as an opportunity for research on generating training material or even simulation, as has previously been pursued for, e.g., learning to rank (Liu 2009), see, e.g., Azzopardi et al (2007), Berendsen et al (2013b)? There is also an important contrast to note here between supervised scenarios, such as learning to rank versus unsupervised learning of word embeddings or typical queries (see Mitra 2015; Mitra and Craswell 2015; Sordoni et al 2015; Van Gysel et al 2016a, b).…”
Section: Results (mentioning)
confidence: 99%
“…This is not only time consuming and expensive, it is also not entirely clear how useful such relevance assessments really are as agreement with actual users' preferences is not necessarily high [23]. Several approaches to address the high costs of producing relevance assessments have been described [3,24]; even further steps are taken in [1,2], with proposals for automatically created test collections.…”
Section: Evaluation For Information Retrieval (mentioning)
confidence: 99%
“…However, there now are n teams that take turns. This implies that, in case n is larger than the number of slots in the interleaved list, some teams may not be represented. Inferring which teams win is now done by counting the number of clicked documents for each team.…”
Section: Interleaving and Multileaving (mentioning)
confidence: 99%
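To illustrate the credit-assignment step described in the excerpt above, the sketch below counts clicked documents per team in a multileaved result list. The data layout (one team label per rank) and the tie handling are assumptions made for this toy example, not the cited papers' implementation.

from collections import Counter

def multileave_winners(team_per_rank, clicked_ranks):
    # Each slot in the multileaved list is attributed to the team (ranker)
    # that contributed it; teams are scored by how many of their documents
    # were clicked, and the top-scoring teams are declared winners.
    credit = Counter()
    for rank in clicked_ranks:
        credit[team_per_rank[rank]] += 1
    if not credit:
        return []                         # no clicks: no team is preferred
    best = max(credit.values())
    return sorted(team for team, c in credit.items() if c == best)

# Example: five slots contributed by teams A, B and C; clicks at ranks 0 and 3.
print(multileave_winners(['A', 'B', 'A', 'C', 'B'], [0, 3]))  # -> ['A', 'C']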
“…This will allow us to see directly if our features are complementary to the other features. We opted for the L2R approach in [2] ("the UvA model"), because of its comprehensiveness. It uses pseudo-test collections [1] to learn to fuse ten well-established retrieval algorithms and implements a number of query, tweet, and query-tweet features.…”
Section: Experiments and Evaluation (mentioning)
confidence: 99%
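To make the fusion idea in that excerpt concrete, here is a rough sketch of a pointwise learning-to-rank setup that treats the scores of several base retrieval algorithms as features and trains on pseudo-test-collection labels. The ranker.score(query, tweet) interface, the extra length features, and the choice of logistic regression are assumptions for illustration only; they are not the UvA model.

import numpy as np
from sklearn.linear_model import LogisticRegression

def feature_vector(query, tweet, base_rankers):
    # One feature per base retrieval algorithm (its score for this
    # query-tweet pair), plus two simple length features; all of these
    # are illustrative stand-ins for the features used in the cited work.
    scores = [ranker.score(query, tweet) for ranker in base_rankers]
    return np.array(scores + [len(query.split()), len(tweet.split())])

def train_fusion_model(labelled_pairs, base_rankers):
    # Pointwise L2R on pseudo-relevance labels: labelled_pairs is assumed to
    # be a list of (query, tweet_text, label) tuples, with labels taken from
    # a pseudo test collection such as the hashtag-derived judgments above.
    X = np.array([feature_vector(q, t, base_rankers) for q, t, _ in labelled_pairs])
    y = np.array([label for _, _, label in labelled_pairs])
    return LogisticRegression(max_iter=1000).fit(X, y)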