Proceedings of the 27th ACM International Conference on Information and Knowledge Management 2018
DOI: 10.1145/3269206.3271686

Differentiable Unbiased Online Learning to Rank

Abstract: Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and exten…
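To make the phrase "weighted differentiable pairwise loss" concrete, here is a minimal sketch of one gradient step for a linear scorer. It is an illustration under stated assumptions, not the paper's implementation: the function names, the logistic form of the pairwise preference probability, and the caller-supplied `pair_weights` are all hypothetical, and the way the paper derives its pair weights from the displayed ranking is not reproduced here.

```python
import numpy as np


def pairwise_preference_prob(score_k, score_l):
    """Assumed form: P(d_k preferred over d_l) = exp(s_k) / (exp(s_k) + exp(s_l)).

    Written as a sigmoid of the score difference, which is algebraically the same.
    """
    return 1.0 / (1.0 + np.exp(-(score_k - score_l)))


def weighted_pairwise_gradient(weights, features, pairs, pair_weights):
    """Gradient of sum_i rho_i * P(d_k > d_l) with respect to the model weights.

    weights:      (n_features,) parameters of a linear scorer (hypothetical model)
    features:     (n_docs, n_features) feature matrix for the displayed documents
    pairs:        list of (k, l) index pairs, document k inferred preferred over l
    pair_weights: one weight rho per pair, supplied by the caller (assumption)
    """
    scores = features @ weights
    grad = np.zeros_like(weights, dtype=float)
    for (k, l), rho in zip(pairs, pair_weights):
        p = pairwise_preference_prob(scores[k], scores[l])
        # d/dw sigmoid(s_k - s_l) = p * (1 - p) * (x_k - x_l) for a linear scorer
        grad += rho * p * (1.0 - p) * (features[k] - features[l])
    return grad


# Usage: one ascent step after a single inferred preference (doc 0 over doc 1).
w = np.zeros(2)
X = np.array([[1.0, 0.0], [0.0, 1.0]])
step = weighted_pairwise_gradient(w, X, pairs=[(0, 1)], pair_weights=[1.0])
w_new = w + 0.1 * step
```

Because the objective is differentiable in the scores, the same update would apply to a non-linear scorer by replacing the explicit feature difference with backpropagated score gradients, which is the property the abstract highlights.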

Citations: cited by 74 publications (119 citation statements)
References: 38 publications
“…Furthermore, to our surprise we find that some properties asserted to pertain to CLTR or OLTR methods in previously published work appear to be lacking when tested. For instance, in contrast with previously published expectations [24] OLTR is not substantially faster at learning than CLTR, and while always assumed to be safe [36], CLTR may be detrimental to the user experience when deployed under high-levels of noise.…”
Section: Introduction (contrasting)
confidence: 60%
“…Online Learning to Rank (OLTR) [8,24,29,37] aims to learn by directly interacting with users. OLTR algorithms affect the data gathered during the learning process because they have control over what is displayed to users.…”
Section: Online Learning To Rank (mentioning)
confidence: 99%