Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval 2019
DOI: 10.1145/3331184.3331269

To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions

Abstract: Learning to Rank (LTR) from user interactions is challenging as user feedback often contains high levels of bias and noise. At the moment, two methodologies for dealing with bias prevail in the field of LTR: counterfactual methods that learn from historical data and model user behavior to deal with biases; and online methods that perform interventions to deal with bias but use no explicit user models. For practitioners the decision between either methodology is very important because of its direct impact on en…
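The abstract's contrast between the two families can be made concrete with a small sketch. Below is a minimal, hypothetical illustration of the counterfactual side: learning a linear ranker from logged clicks with inverse propensity scoring (IPS). The click pattern, propensity values, pairwise loss, and toy features are assumptions for illustration, not the paper's actual experimental setup.

```python
import numpy as np

# Hypothetical logged data: one query with 5 documents. Counterfactual LTR
# reweights observed clicks by the inverse of the probability that each
# position was examined (the propensity), so position bias in the logs
# does not bias the learned ranker. All values below are made up.
clicks = np.array([1, 0, 1, 0, 0])                     # observed click feedback
propensities = np.array([1.0, 0.7, 0.5, 0.35, 0.25])   # assumed examination probs

features = np.random.default_rng(0).normal(size=(5, 3))  # toy document features
weights = np.zeros(3)                                     # linear ranker params

lr = 0.1
for _ in range(100):
    scores = features @ weights
    grad = np.zeros(3)
    # IPS-weighted pairwise logistic loss: push each clicked doc above the
    # non-clicked docs, weighting the clicked doc by 1 / propensity.
    for i in np.flatnonzero(clicks):
        w_i = 1.0 / propensities[i]
        for j in range(len(clicks)):
            if clicks[j] == 0:
                margin = scores[i] - scores[j]
                g = -w_i / (1.0 + np.exp(margin))   # d/d(margin) log(1+e^-margin)
                grad += g * (features[i] - features[j])
    weights -= lr * grad / len(clicks)

print("learned ranking:", np.argsort(-(features @ weights)))
```

Online methods differ in that, instead of reweighting historical logs, they decide which rankings to show next (interventions) and infer preferences from the resulting comparisons.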

Cited by 58 publications (2 citation statements)
References 38 publications
“…In order to evaluate these approaches effectively, we need a dataset containing logged feedback for context-action pairs, along with the logging propensity for the action performed. Related work has evaluated counterfactual learning methods on multi-class, multi-label or LTR tasks [13,17,47,48], synthetically generating bandit feedback samples for a certain logging policy and existing datasets. What makes the recommendation task fundamentally different from the aforementioned settings is that access to the true labels (i.e.…”
Section: Results (mentioning, confidence: 99%)
“…At the local scale, we exploit a multi-interest extractor module to learn fine-grained representations of multiple interests from the corresponding subsequences discovered via intention prototype clustering. Encouragingly, noisy behaviors (e.g., sales promotions, exposure bias [31], and position bias [8]) that are inconsistent with the user's real interests are filtered out during clustering. We further develop an interest aggregation module, which leverages the inherent preference to guide multi-interest aggregation and generate the user's current interest.…”
Section: Introduction (mentioning, confidence: 99%)
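The clustering idea in this statement can be sketched roughly as below, assuming precomputed item embeddings, k-means-style intention prototypes, and a pooled-sequence preference vector; all names and the aggregation rule are hypothetical stand-ins, not the cited paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical user behavior sequence: 20 interacted items, 16-dim embeddings.
seq = rng.normal(size=(20, 16))

# Intention prototype clustering (a plain k-means sketch): assign each item
# to its nearest prototype, splitting the sequence into subsequences.
k = 3
protos = seq[rng.choice(len(seq), k, replace=False)].copy()
for _ in range(10):
    assign = np.argmin(((seq[:, None, :] - protos[None]) ** 2).sum(-1), axis=1)
    for c in range(k):
        members = seq[assign == c]
        if len(members):
            protos[c] = members.mean(axis=0)

# Multi-interest extraction: mean-pool each subsequence into one interest
# vector (items far from every prototype would be the "noisy" behaviors
# a real model could down-weight or drop).
interests = np.stack([
    seq[assign == c].mean(axis=0) if (assign == c).any() else protos[c]
    for c in range(k)
])

# Interest aggregation: attention over the interests, guided by an assumed
# inherent-preference vector (here, simply the pooled whole sequence).
preference = seq.mean(axis=0)
logits = interests @ preference
attn = np.exp(logits - logits.max())
attn /= attn.sum()
current_interest = attn @ interests    # the user's current interest vector
print("attention over interests:", np.round(attn, 3))
```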