2022
DOI: 10.1609/aaai.v36i6.20554

Modeling Attrition in Recommender Systems with Departing Bandits

Abstract: Traditionally, when recommender systems are formalized as multi-armed bandits, the policy of the recommender system influences the rewards accrued, but not the length of interaction. However, in real-world systems, dissatisfied users may depart (and never come back). In this work, we propose a novel multi-armed bandit setup that captures such policy-dependent horizons. Our setup consists of a finite set of user types, and multiple arms with Bernoulli payoffs. Each (user type, arm) tuple corresponds to an (unknown) …
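For intuition, the setup described in the abstract can be sketched as a small simulation. This is an illustrative sketch, not the paper's algorithm: the user-type prior, the reward probabilities, and the rule that a user departs after the first zero reward are assumptions chosen for the example.

```python
import numpy as np

# Illustrative sketch of a policy-dependent-horizon ("departing") bandit.
# Assumptions for this example only: two user types, two Bernoulli arms,
# and a user who departs permanently after the first zero reward.
rng = np.random.default_rng(0)

n_types, n_arms = 2, 2
type_prior = np.array([0.5, 0.5])          # distribution over user types
reward_probs = np.array([[0.9, 0.2],       # reward_probs[user_type, arm]
                         [0.3, 0.8]])

def simulate_user(policy, max_steps=1000):
    """Interact with one user until departure; the horizon depends on the policy."""
    user_type = rng.choice(n_types, p=type_prior)   # hidden from the learner
    total_reward = 0
    for t in range(max_steps):
        arm = policy(t)
        reward = int(rng.random() < reward_probs[user_type, arm])
        total_reward += reward
        if reward == 0:        # dissatisfied user departs and never returns
            break
    return total_reward

# A fixed policy that always plays arm 0: long sessions for type-0 users,
# very short sessions for type-1 users.
print(simulate_user(lambda t: 0))
```

Because departure ends the interaction, the expected return of a policy depends on both the per-step reward and the induced session length, which is the tension the policy-dependent-horizon setup captures.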

Cited by 4 publications (6 citation statements)
References 19 publications
“…From a broader perspective, our work relates to bandits with complex rewards schemes, i.e., abandonment elements [3,11,44], and non-stationary rewards [4,22,26,27,34,38]. Our work is also related to multi-stakeholder recommendation systems [5,6,9,30,39,45] and fairness in machine learning [12,29,43].…”
Section: Related Work (mentioning)
confidence: 99%
“…Notice that by the end of the exploration stage, all arms are still viable. To see this, recall that our assumption in Inequality (3) suggests that at some point in every phase, all arms a will be pulled at least max(δ_a, γτ) times, which is by definition greater than the exposure constraint δ_a.…”
Section: Meta Algorithm (mentioning)
confidence: 99%
“…Several papers have proposed multi-armed bandit models where surrogate outcomes encode actions' long-term impacts. These include bandit models where poor recommendations cause attrition [Ben-Porat et al, 2022, Bastani et al, 2022] and bandit models where objectives incorporate diversity/boredom considerations [Xie et al, 2022, Cao et al, 2020, Ma et al, 2016]. Wu et al [2017] studies a variation on the typical bandit model where actions impact whether a user will return to the system.…”
Section: Surrogate Outcomes and Proxy-metrics (mentioning)
confidence: 99%