2014
DOI: 10.1109/tcyb.2013.2253094

Heuristically-Accelerated Multiagent Reinforcement Learning

Abstract: This paper presents a novel class of algorithms, called Heuristically-Accelerated Multiagent Reinforcement Learning (HAMRL), which allows the use of heuristics to speed up well-known multiagent reinforcement learning (RL) algorithms such as the Minimax-Q. Such HAMRL algorithms are characterized by a heuristic function, which suggests the selection of particular actions over others. This function represents an initial action selection policy, which can be handcrafted, extracted from previous experience in disti…
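The mechanism described in the abstract, an action-selection rule biased by a heuristic function H, can be pictured with a short sketch. The code below is a simplified single-agent, tabular, ε-greedy version with a heuristic weight ξ; it is not the paper's exact Minimax-Q formulation, and all names and parameter values are illustrative assumptions.

```python
import random

def ha_action_selection(Q, H, state, actions, xi=1.0, epsilon=0.1):
    """Heuristically accelerated action selection (illustrative sketch):
    with probability 1 - epsilon pick argmax_a [Q(s, a) + xi * H(s, a)],
    otherwise explore with a uniformly random action."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)] + xi * H[(state, a)])
```

When H(s, a) is zero everywhere this reduces to ordinary ε-greedy selection, so the heuristic only biases exploration and does not change the learned Q-values.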

Cited by 69 publications (59 citation statements)
References 17 publications
“…The same effect happens when the heuristic is used only until a certain episode (Figure 7c). The results presented here corroborate the results presented in (Bianchi et al., 2014). This experiment was coded in C++, compiled with GNU g++, and executed on a virtual machine running Linux Ubuntu 14 LTS, virtualised with VMware Player on a Mac Pro running Mac OS X 10.6 with a 2.66 GHz Intel Xeon processor and 12 GB of RAM.…”
Section: Experiments 1: Mountain Car Problem (supporting)
confidence: 86%
“…Figure 7 shows that the results of the negative transfer when using L3-SARSA(λ) depend on the values of the η and ξ parameters and their decay (Figures 7a and 7b), and on the number of episodes for which the heuristic is used (Figure 7c). Bianchi et al. (2014) showed that, with fixed values of η and ξ, the algorithm takes longer to ignore the negative transfer. Multiplying ξ by a decay value at the end of each episode reduces the influence of the heuristics over time.…”
Section: Experiments 1: Mountain Car Problem (mentioning)
confidence: 99%
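The decay schedule mentioned in this excerpt can be summarised in a few lines. The sketch below assumes a multiplicative per-episode decay of the heuristic weight ξ; the initial value, decay factor, and episode count are illustrative and not taken from Bianchi et al. (2014).

```python
xi = 1.0          # initial heuristic weight (illustrative value)
xi_decay = 0.99   # multiplicative per-episode decay (illustrative value)

for episode in range(500):
    # ... run one learning episode, with xi weighting H(s, a) in action selection ...
    xi *= xi_decay  # heuristic influence shrinks each episode, so a misleading
                    # (negatively transferred) heuristic is gradually ignored
```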
“…Taylor and Stone [15] introduced behavior transfer, a novel approach to speeding up traditional RL. Celiberto, Matsuura et al. [2] applied transfer learning from one agent to another by means of the heuristic function, which speeds up the convergence of the algorithm. Case-based reasoning is used to transfer the learning, yielding the TL-HAQL algorithm.…”
Section: Approach On Accelerated Multiagent Reinforcement Learning (mentioning)
confidence: 99%
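One way to picture the transfer described in this excerpt is to derive the target agent's heuristic from a source agent's learned greedy policy. The sketch below assumes tabular Q-values and a constant bonus η for the transferred action; the construction and names are illustrative and not the exact TL-HAQL procedure.

```python
def build_transfer_heuristic(Q_source, states, actions, eta=1.0):
    """Turn a source agent's greedy policy into a heuristic H for the
    target agent: the action the source prefers in each state receives
    a bonus eta, all other actions receive zero. (Illustrative sketch.)"""
    H = {}
    for s in states:
        best = max(actions, key=lambda a: Q_source[(s, a)])
        for a in actions:
            H[(s, a)] = eta if a == best else 0.0
    return H
```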
“…In the particular case of multiagent systems, the reinforcement received by each agent depends both on the dynamics of the environment and on the behavior of the other agents, so a multiagent reinforcement learning (MRL) algorithm must address the resulting nonstationary scenarios arising from both the environment and the other agents. Unfortunately, convergence of any RL algorithm requires extensive exploration of the state-action space, which can be very time consuming [2]; moreover, the presence of multiple agents further increases the size of the state-action space, worsening the convergence of RL algorithms (even to suboptimal control policies) when they are adapted to multiagent problems. Acceleration of the learning process is therefore one of the important issues in reinforcement learning [3,4].…”
Section: Introduction (mentioning)
confidence: 99%