2006
DOI: 10.1007/11871842_74

Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

Abstract: Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that mitigate each of these curses. To handle the state-space explosion, we introduce "tabular linear functions" that generalize tile-coding and linear value functions. Action space complexity is reduced by replacing complete joint action space search with a form of hill climbing. To deal with high stochasticity, we introduce a new algori…
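The abstract's "tabular linear functions" can be read as a hybrid of table lookup and linear value functions: each weight lives in a table indexed by discrete features and is multiplied by a real-valued feature, so a pure table and a pure linear function are both special cases. The sketch below illustrates that reading only; the class name, feature extractors, and TD-style update are assumptions for illustration, not the paper's implementation.

```python
from collections import defaultdict

class TabularLinearValue:
    """Sketch of a "tabular linear function":

        V(s) = sum_i  w_i[key_i(s)] * x_i(s)

    Each weight w_i is looked up in a table indexed by a discrete feature
    key_i(s) and multiplied by a real-valued feature x_i(s). With every
    x_i(s) == 1 this reduces to an ordinary table; with a single constant
    key it reduces to a plain linear value function.
    """

    def __init__(self, key_fns, feat_fns):
        self.key_fns = key_fns                      # discrete feature extractors
        self.feat_fns = feat_fns                    # real-valued feature extractors
        self.tables = [defaultdict(float) for _ in key_fns]

    def value(self, s):
        return sum(tab[k(s)] * f(s)
                   for tab, k, f in zip(self.tables, self.key_fns, self.feat_fns))

    def update(self, s, target, lr=0.1):
        """Move V(s) toward a scalar target (an assumed TD-style rule)."""
        err = target - self.value(s)
        for tab, k, f in zip(self.tables, self.key_fns, self.feat_fns):
            tab[k(s)] += lr * err * f(s)


# Hypothetical delivery-style state: (truck_location, shop_inventory_level)
V = TabularLinearValue(
    key_fns=[lambda s: s[0], lambda s: s[0]],       # weights indexed by location
    feat_fns=[lambda s: 1.0, lambda s: s[1]],       # bias term and inventory level
)
V.update(("shop_3", 4.0), target=2.5)
print(V.value(("shop_3", 4.0)))                     # current estimate for this state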

Cited by 22 publications (12 citation statements) | References 8 publications
“…The criterion of choosing a better joint action helps to reduce the computational cost of searching the joint actions, which grows exponentially with the number of agents, and provides better performance. We present in this section the hill climbing search (HCS) algorithm proposed in [10]. Then, we propose an enhancement of the hill climbing search algorithm for optimal joint action selection (eHCS) that speeds up the action search.…”
Section: E. Coordinated Multi-agent RL
confidence: 99%
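The hill climbing search (HCS) over joint actions that this citation refers to amounts to coordinate ascent: rather than scoring all |A|^n joint actions for n agents, one repeatedly improves a single agent's action while holding the others fixed until no single-agent change helps. The following sketch is an illustrative reconstruction under that reading; the evaluate callback, the random starting point, and the sweep limit are assumptions, not the authors' code.

```python
import random

def hill_climb_joint_action(agent_actions, evaluate, start=None, max_sweeps=100):
    """Coordinate-ascent ("hill climbing") search over a joint action.

    agent_actions: list of per-agent action lists, e.g. [[0, 1], ["N", "S"], ...]
    evaluate:      callable mapping a joint action (tuple) to a scalar score,
                   e.g. an estimated action value under the current model
    Returns (joint_action, score) such that no single-agent change improves it
    (or the best found within max_sweeps).
    """
    joint = list(start) if start is not None else [random.choice(a) for a in agent_actions]
    best = evaluate(tuple(joint))

    for _ in range(max_sweeps):
        improved = False
        for i, actions in enumerate(agent_actions):   # improve one agent at a time
            for a in actions:
                if a == joint[i]:
                    continue
                candidate = joint[:i] + [a] + joint[i + 1:]
                score = evaluate(tuple(candidate))
                if score > best:
                    joint, best = candidate, score
                    improved = True
        if not improved:                              # local optimum reached
            break
    return tuple(joint), best
```

Each sweep costs on the order of n·|A| evaluations instead of |A|^n, which is the cost reduction the citing papers attribute to HCS; the trade-off is that the returned joint action may only be a local optimum.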
“…Unlike the distributed approach, which consists of forwarding all information between all agents and may be time consuming, plenty of centralized joint action selection algorithms exist in the literature, such as hill climbing search [10] and Stackelberg Q-Learning [11]. In our model we propose a modified version of the hill climbing search algorithm which fits our drone action selection problem.…”
Section: E. Coordinated Multi-agent RL
confidence: 99%
“…The numerous successful applications of reinforcement learning include (in no particular order) learning in games (e.g., Backgammon (Tesauro, 1994) and Go (Silver et al, 2007)), applications in networking (e.g., packet routing (Boyan and Littman, 1994), channel allocation (Singh and Bertsekas, 1997)), applications to operations research problems (e.g., targeted marketing (Abe et al, 2004), maintenance problems (Gosavi, 2004), job-shop scheduling (Zhang and Dietterich, 1995), elevator control (Crites and Barto, 1996), pricing (Rusmevichientong et al, 2006), vehicle routing (Proper and Tadepalli, 2006), inventory control (Chang et al, 2007), fleet management (Simão et al, 2009)), learning in robotics (e.g., controlling quadruped robots (Kohl and Stone, 2004), humanoid robots (Peters et al, 2003), or helicopters (Abbeel et al, 2007)), and applications to finance (e.g., option pricing (Tsitsiklis and Van Roy, 1999b, 2001; Yu and Bertsekas, 2007; Li et al, 2009) …”
Section: Applications
confidence: 99%
“…ARL has been previously used in the context of Q-learning [20] to achieve a compact representation of the value function; see for example [21]. More recently, it has also been used in model-based RL [22], in relational RL [23] and in learning general games [24].…”
Section: Introduction
confidence: 99%