2022
DOI: 10.48550/arxiv.2207.08673
Preprint

Back to the Manifold: Recovering from Out-of-Distribution States

Cited by 1 publication (1 citation statement)
References 0 publications
“…Additionally, acknowledging the presence of uncertainty in the deployment of RL-based recommender systems paves the way towards solutions that are robust or resilient to such uncertainty. For instance, Oosterhuis and de Rijke [2021] propose a criterion for fallback to a safer policy when out-of-distribution (although in a different context, i.e., counterfactual learning to rank), and Ghosh et al [2022]; Reichlin et al [2022] propose adaptive offline RL policies that are able to recover from stepping in uncertain states during deployment by branching back to supported states. We hope that future research in recommender systems will put stronger emphasis on these aspects and reduce the gap between offline and online performance.…”
Section: Uncertainty-aware Evaluation
Citation type: mentioning
confidence: 99%