2015
DOI: 10.48550/arxiv.1507.04564
Preprint

Selecting the best system and multi-armed bandits

Abstract: Consider the problem of finding, among many populations or probability distributions, the one with the largest mean when these means are unknown but population samples can be simulated or otherwise generated. Typically, by selecting the population with the largest sample mean, it can be shown that the false selection probability decays at an exponential rate. Lately, researchers have sought algorithms that guarantee that this probability is restricted to a small δ in order log(1/δ) computational time by estimating the associated lar…
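The selection rule named in the abstract (pick the population with the largest sample mean) can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `select_best` and `false_selection_rate`, the unit-variance Gaussian populations, and the Monte Carlo estimate of the false selection probability are all assumptions made here for illustration.

```python
import random
from statistics import fmean


def select_best(samplers, n):
    """Draw n samples from each population; return the index with the largest sample mean."""
    sample_means = [fmean([draw() for _ in range(n)]) for draw in samplers]
    return max(range(len(sample_means)), key=sample_means.__getitem__)


def false_selection_rate(true_means, n, trials=500, seed=0):
    """Monte Carlo estimate of how often the rule picks a non-best population.

    Assumption for illustration: each population is Gaussian with unit variance.
    """
    rng = random.Random(seed)
    best = max(range(len(true_means)), key=true_means.__getitem__)
    samplers = [lambda m=m: rng.gauss(m, 1.0) for m in true_means]
    wrong = sum(select_best(samplers, n) != best for _ in range(trials))
    return wrong / trials


if __name__ == "__main__":
    # The estimated false selection rate should shrink quickly as n grows.
    for n in (5, 20, 80):
        print(n, false_selection_rate([0.0, 0.5], n))
```

Under these assumptions, increasing the per-population sample size `n` drives the estimated false selection rate toward zero, which is the qualitative behaviour the abstract refers to.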

Cited by 3 publications (5 citation statements)
References 18 publications
“…The remainder of this section is devoted to the proof of Theorem 2. We note here that a similar impossibility result was proved by [25] for the pure exploration bandit problem in the fixed-confidence setting. Our proof technique is inspired by their methodology and also relies crucially on the lower bounds in [24].…”
Section: Fundamental Performance Limits For Robust Algorithms (supporting)
confidence: 76%
“…First, we assume that the (rate) function J(α_x, α_y, t) can be analytically optimized over t. In this case, the objective in (9) is simpler to estimate and optimize. Note that these instances are direct analogues in the SAA context of the ranking and selection (R&S) problems studied in [4,7], and [6]. To deal with these types of problems, we present Algorithm 1 below, that parallels [7] Algorithm 2.…”
Section: Sequential Optimization (mentioning)
confidence: 99%
“…While normality is justified by batching and the Central Limit Theorem (CLT), the assumption typically fails when simulations are run under an estimate of θ_c, especially if such an estimate is updated in an online fashion. In this paper, we build our procedures on a Sequential Elimination framework ([11,12,13]), as it allows us to construct valid continuation regions even in the presence of IU. Here we use a production-inventory problem (see Section 5 for details) to illustrate how our procedure works.…”
Section: Fixed Confidence Formulation (mentioning)
confidence: 99%
“…Our first procedure is a direct extension of a Sequential Elimination framework proposed by [11,12], which was also discussed recently in [13]. This general paradigm has a simple structure and can be extended to handle IU.…”
Section: The SE-IU Procedures (mentioning)
confidence: 99%