1979
DOI: 10.1111/j.2517-6161.1979.tb01068.x
Bandit Processes and Dynamic Allocation Indices

Abstract: The paper aims to give a unified account of the central concepts in recent work on bandit processes and dynamic allocation indices; to show how these reduce some previously intractable problems to the problem of calculating such indices; and to describe how these calculations may be carried out. Applications to stochastic scheduling, sequential clinical trials and a class of search problems are discussed.
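The "dynamic allocation index" of the abstract is now commonly called the Gittins index. As a rough illustration of how such an index may be calculated, the sketch below (a toy under stated assumptions, not the paper's own algorithm; the function name `gittins_index_bernoulli` and all parameter defaults are hypothetical) approximates the index of a Bernoulli arm with a Beta(a, b) posterior via the standard calibration idea: bisect on the payoff rate `lam` of a known "standard" arm until the decision maker is indifferent between pulling the risky arm and retiring to `lam` forever. The horizon is truncated, so the result is an approximation.

```python
def gittins_index_bernoulli(a=1, b=1, gamma=0.9, horizon=60, iters=30):
    """Approximate Gittins index of a Bernoulli arm with Beta(a, b) posterior.

    Calibration sketch: the index is the known rate `lam` of a standard arm
    at which the decision maker is indifferent between pulling the risky arm
    and retiring to `lam` forever. Truncated horizon => approximation only.
    """
    def root_gain(lam):
        retire = lam / (1.0 - gamma)          # value of retiring forever
        # Backward induction over states (s successes, f failures), s + f = depth.
        # Crude boundary at the truncation depth: assume retirement there.
        V_next = {(s, horizon - s): retire for s in range(horizon + 1)}
        for depth in range(horizon - 1, 0, -1):
            V = {}
            for s in range(depth + 1):
                f = depth - s
                p = (a + s) / (a + s + b + f)  # posterior mean success rate
                cont = (p * (1.0 + gamma * V_next[(s + 1, f)])
                        + (1.0 - p) * gamma * V_next[(s, f + 1)])
                V[(s, f)] = max(retire, cont)  # pull, or retire to the standard arm
            V_next = V
        p0 = a / (a + b)
        cont0 = (p0 * (1.0 + gamma * V_next[(1, 0)])
                 + (1.0 - p0) * gamma * V_next[(0, 1)])
        return cont0 - retire                  # > 0 iff risky arm preferred

    lo, hi = 0.0, 1.0
    for _ in range(iters):                     # bisect on the calibration rate
        mid = 0.5 * (lo + hi)
        if root_gain(mid) > 0.0:
            lo = mid                           # risky arm still preferred
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under these assumptions the index for a uniform Beta(1, 1) prior exceeds the posterior mean of 0.5 (the index rewards the option value of learning), and an arm with more observed successes carries a higher index, which is why the index policy "pull the arm of largest current index" is computable one arm at a time.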

Cited by 1,113 publications (758 citation statements)
References 17 publications
“…Our presentation of bandit problems is quite sketchy, and we send the reader to specialized references such as (Gittins 1979, 1989; Whittle 1982; Berry and Fristedt 1985).…”
Section: Comparison of Rewards and Strategies in a One-Armed Bandit Problem
confidence: 99%
“…This durable-good problem differs from the non-durable-good problem (discussed previously), since in the former case information acquisition has a constant cost, while in the latter it has an endogenous cost, given that examination means consumption. The solution of this search model with repetitive consumption has been worked out in clinical trials and in economics; it is known as the armed bandit problem (see Gittins 1979, 1989; Berry and Fristedt 1985).…”
confidence: 99%
“…A relaxed version of the restless bandit problem is introduced, which can be solved optimally in polynomial time. Based on this relaxation, a priority-index heuristic policy is proposed that reduces to the optimal Gittins index policy [30] in the special case of the multi-armed bandit problem. However, the Whittle index heuristic can be applied only when the restless bandit problem satisfies a certain indexability property, which is hard to check [21,31].…”
Section: Solving the Restless Bandit Problem
confidence: 99%
“…He introduces a priority policy that relies on an index computed for each job from that job's own properties, independently of the other jobs. Gittins [12] showed that this priority index is a special case of his Gittins index [12,13]. Twenty years after Sevcik presented the priority policy, Weiss [33] formulated Sevcik's priority index again in terms of the Gittins index and provided a different proof of the optimality of the priority policy.…”
Section: Preemption
confidence: 99%