2020
DOI: 10.1287/moor.2019.0998

A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits

Abstract: The Whittle index, which characterizes optimal policies for controlling certain single restless bandit projects (a Markov decision process with two actions: active and passive), is the basis for a widely used heuristic index policy for the intractable restless multiarmed bandit problem. Yet two roadblocks need to be overcome to apply such a policy: the individual projects in the model at hand must be shown to be indexable, so that they possess a Whittle index; and the index must be evaluated. Such roadblocks ca…

Cited by 12 publications (16 citation statements)
References 62 publications
“…We next aim to establish that condition (ii) in Theorem 1(a) is satisfied by the present model, i.e., that $\mathcal{F}$-policies, i.e., those with active sets $S_0 \oplus S_1 \in \mathcal{F}$ that are defined by (35), suffice to solve the $\lambda$-price problem (33) for any price $\lambda \in \mathbb{R}$. We will use the DP optimality equations that characterize the optimal value function $V^*_{(a^-, i)}(\lambda)$ for problem (33), starting from each augmented state $(a^-, i) \in \mathcal{Y}$: thus, for each original state $i \in \mathcal{X}$,…”
Section: Proving That F-Policies Are Optimal
Citation type: mentioning
confidence: 99%
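
The DP optimality equations mentioned in the statement above are not reproduced in this excerpt. As a rough illustration only, the sketch below runs value iteration on a generic discounted λ-price problem for a small finite-state project, without the augmented previous-action component. The function name `solve_price_problem`, the transition matrices `P0` and `P1`, the rewards `r0` and `r1`, and the discount factor `beta` are all hypothetical and are not taken from the citing paper. It also assumes that the active action is charged the price `lam` once per period.

```python
def solve_price_problem(P0, P1, r0, r1, lam, beta=0.9, tol=1e-10, max_iter=100_000):
    """Value iteration for a single-project lambda-price problem (illustrative).

    Assumption: the passive action earns r0[i]; the active action earns
    r1[i] - lam, i.e. activity is charged the price lam each period.
    Returns the approximate optimal values and the set of states where
    the active action is strictly better.
    """
    n = len(r0)

    def q_values(V):
        # One-step lookahead value of each action in each state.
        q0 = [r0[i] + beta * sum(P0[i][j] * V[j] for j in range(n)) for i in range(n)]
        q1 = [r1[i] - lam + beta * sum(P1[i][j] * V[j] for j in range(n)) for i in range(n)]
        return q0, q1

    V = [0.0] * n
    for _ in range(max_iter):
        q0, q1 = q_values(V)
        V_new = [max(a, b) for a, b in zip(q0, q1)]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break

    q0, q1 = q_values(V)
    active_set = {i for i in range(n) if q1[i] > q0[i]}
    return V, active_set


if __name__ == "__main__":
    # Hypothetical two-state project, invented for this example.
    P0 = [[0.9, 0.1], [0.2, 0.8]]
    P1 = [[0.5, 0.5], [0.5, 0.5]]
    r0 = [0.0, 0.2]
    r1 = [1.0, 0.6]
    for lam in (-100.0, 0.0, 100.0):
        V, active = solve_price_problem(P0, P1, r0, r1, lam)
        print(lam, sorted(active))
```

With a very low price every state ends up in the active set, and with a very high price none does; the interesting structure is how the active set changes in between.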
“…In that way, the MABP with switching costs is cast as a multi-armed restless bandit problem (MARBP) without them, which allows for the deployment of theoretical and algorithmic results on restless bandit indexation, as introduced in [29] by Whittle. Such a theory has been developed in [30][31][32][33] by the author. Additionally, see the survey [34].…”
Section: Approach Via Restless Bandit Reformulation Whittle Index A
Citation type: mentioning
confidence: 99%
“…Under indexability, the $P_\lambda$-optimal active and passive sets $S^{*,1}(\lambda)$ and $S^{*,0}(\lambda)$ are characterized by an index attached to project states. Note that the definition below refers to the sign function $\operatorname{sgn}\colon \mathbb{R} \to \{-1, 0, 1\}$, and follows the formulation of indexability in ([25], Definition 1). Definition 1 (Indexability and Whittle index).…”
Section: Indexability
Citation type: mentioning
confidence: 99%
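
The definition itself is cut off in the statement above, so the following is only a hedged numerical sketch of the general idea, not the formulation in [25]. It assumes a finite-state project that is already known to be indexable, so that the sign of the advantage of the active action over the passive one in a given state changes exactly once as the price grows, and it locates that price by bisection. All names (`sgn`, `advantage`, `whittle_index_bisection`, `P0`, `P1`, `r0`, `r1`, `beta`) are hypothetical.

```python
def sgn(x, eps=1e-9):
    """Sign function with values in {-1, 0, 1}, using a small numerical tolerance."""
    return (x > eps) - (x < -eps)


def advantage(P0, P1, r0, r1, lam, beta=0.9, tol=1e-10, max_iter=100_000):
    """Active-minus-passive one-step lookahead value in each state at price lam.

    Assumption: passive earns r0[i], active earns r1[i] - lam.
    """
    n = len(r0)

    def q_values(V):
        q0 = [r0[i] + beta * sum(P0[i][j] * V[j] for j in range(n)) for i in range(n)]
        q1 = [r1[i] - lam + beta * sum(P1[i][j] * V[j] for j in range(n)) for i in range(n)]
        return q0, q1

    V = [0.0] * n
    for _ in range(max_iter):
        q0, q1 = q_values(V)
        V_new = [max(a, b) for a, b in zip(q0, q1)]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break

    q0, q1 = q_values(V)
    return [q1[i] - q0[i] for i in range(n)]


def whittle_index_bisection(P0, P1, r0, r1, state, lo=-100.0, hi=100.0, beta=0.9, iters=60):
    """Bisection on the price; valid only if the project is indexable and
    the index of `state` lies in [lo, hi] (both are assumptions here)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sgn(advantage(P0, P1, r0, r1, mid, beta)[state]) > 0:
            lo = mid  # active still strictly preferred: the index is above mid
        else:
            hi = mid  # passive preferred or tie: the index is at or below mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical project whose dynamics do not depend on the action.
    P = [[0.7, 0.3], [0.4, 0.6]]
    r0 = [0.0, 0.0]
    r1 = [1.0, 2.5]
    print([whittle_index_bisection(P, P, r0, r1, s) for s in range(2)])
```

When the two actions share the same transition matrix, as in the usage example, the future-value terms cancel and the advantage in state `i` is `r1[i] - r0[i] - lam`, which gives a simple sanity check. Bisection does not establish indexability; it only evaluates the index once indexability has been shown by other means.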
“…Typically, researchers use ad hoc analyses to prove indexability and calculate the Whittle index in particular models. In contrast, the author has introduced and developed in [3,24] a methodology to establish indexability and compute the Whittle index for general finite-state restless bandits, extended to the semi-Markov denumerable-state case in [4] and to the continuous-state case in [25]. The effectiveness of such an approach, based on verification of so-called PCL-indexability conditions, as they are grounded on satisfaction by project performance metrics of partial conservation laws (PCLs), has been demonstrated in diverse models.…”
Section: Introduction
Citation type: mentioning
confidence: 99%