2021
DOI: 10.1109/tac.2021.3053539
Risk-Averse Allocation Indices for Multiarmed Bandit Problem

Cited by 8 publications (3 citation statements) · References 23 publications
“…Related Work: As a well-known model to capture the exploration-exploitation dilemma in decision-making, multiarmed bandits have attracted extensive attention (see [13] for a survey). A variety of generalized bandit problems have been investigated, where the situations with non-stationary reward functions [14], restless arms [15], satisficing reward objectives [16], heavy-tailed reward distributions [17], risk-averse decision-makers [18], and multiple players [19] are considered. Distributed algorithms have also been proposed to tackle bandit problems (e.g., see [20]-[27]).…”
Section: Introduction (citation type: mentioning, confidence: 99%)
“…As a well-known model to capture the exploration-exploitation dilemma in decision-making, multi-armed bandits have attracted extensive attention (see [13] for a survey). A variety of generalized bandit problems have been investigated, where the situations with non-stationary reward functions [14], restless arms [15], satisficing reward objectives [16], heavy-tailed reward distributions [17], risk-averse decision-makers [18], and multiple players [19] are considered. Recently, distributed algorithms have also been proposed to tackle bandit problems (e.g., see [20]-[27]).…”
Section: Introduction (citation type: mentioning, confidence: 99%)
“…The Upper Confidence Bound (UCB) algorithm and its variants have proven their strength in tackling multi-armed bandit problems (e.g., see [5], [6]). Various generalizations of the classical bandit problem have been studied, in which nonstationary reward functions [7], [8], restless arms [9], satisficing reward objectives [10], risk-averse decision-makers [11], heavy-tailed reward distributions [12], and multiple players [13] are considered. Recently, increasing attention has also been paid to tackling bandit problems in a distributed fashion (e.g., see [14]-[17]).…”
Section: Introduction (citation type: mentioning, confidence: 99%)
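The excerpt above credits index policies such as UCB with much of the progress on multi-armed bandits. As a minimal illustration only, and not the risk-averse allocation index proposed in the paper under discussion, the following Python sketch runs the classic UCB1 index on a simulated Bernoulli bandit; the arm means, horizon, and function name are made-up values for the example.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run the classic UCB1 index policy on a simulated Bernoulli bandit.

    means   -- true (hidden from the learner) success probabilities, one per arm
    horizon -- total number of pulls
    Returns the cumulative regret relative to always playing the best arm.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # times each arm has been pulled
    totals = [0.0] * k    # sum of observed rewards per arm
    best = max(means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialize its index
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            arm = max(
                range(k),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        regret += best - means[arm]
    return regret

# Example with three assumed arm means over 5000 pulls.
print(ucb1([0.3, 0.5, 0.7], horizon=5000))
```

Risk-averse variants, as in the cited works, replace the empirical-mean term with a risk measure of the observed rewards (for example a mean-variance or CVaR estimate) while keeping the same index-based allocation structure.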