2019
DOI: 10.48550/arxiv.1906.10173
Preprint

The Finite-Horizon Two-Armed Bandit Problem with Binary Responses: A Multidisciplinary Survey of the History, State of the Art, and Myths

Peter Jacko

Abstract: In this paper we consider the two-armed bandit problem, which often naturally appears per se or as a subproblem in some multi-armed generalizations, and serves as a starting point for introducing additional problem features. The consideration of binary responses is motivated by its widespread applicability and by being one of the most studied settings. We focus on the undiscounted finite-horizon objective, which is the most relevant in many applications. We make an attempt to unify the terminology as this is d…

Cited by 3 publications (3 citation statements)
References 45 publications
“…For the interested reader, Rosenberger and Lachin (2016, Section 10.2) gives a brief summary of the history of both of these areas, and an overview of multi-arm bandit models is presented in the review paper of Villar et al (2015a). For a review of non-randomized algorithms for the two-arm bandit problem, see Jacko (2019).…”
Section: RAR Methodology Literature
Mentioning confidence: 99%
“…There exists a wealth of published work on multi-armed bandits under a variety of assumptions [8], [14], [15], [18]- [20] with some work focusing exclusively on Bernoulli bandits [2]- [4], [6], [16], [21]- [23]. However, as stressed in [14], few articles have been published on any variation of bandits with delayed rewards.…”
Section: A. Related Work and Motivation
Mentioning confidence: 99%
“…This paper will use the clinical trial setting as our main focus. Jacko [30] illustrates the formulated terminology of MABPs in various disciplines, and we follow the terminology that is widely used in the machine learning and clinical trial communities. For instance, 'algorithm' and 'arms' in MABPs correspond to 'design' and 'treatments' in biostatistics (see Table 1 in [30]).…”
Section: Introduction
Mentioning confidence: 99%