2014
DOI: 10.1016/j.engappai.2014.01.001

Modeling of route planning system based on Q value-based dynamic programming with multi-agent reinforcement learning algorithms

citations
Cited by 80 publications
(30 citation statements)
references
References 36 publications
“…In this context, their algorithm outperforms similar algorithms such as Q-learning, real-time adaptive learning, and fixed timing plans when considering average delay, number of stops and vehicular emissions. In addition, Zolfpour-Arokhlo et al [20] apply a multi-agent reinforcement learning algorithm for obtaining a route planning system. They consider environment features such as weather, traffic, road safety and fuel capacity.…”
Section: Learning Algorithms In Abss (mentioning)
confidence: 99%
“…Recently, Multi-Agent Systems (MAS) and Reinforcement Learning (RL) have been integrated and applied to the field of traffic management, such as traffic control [29]-[30] and route planning [31]-[32]. With the advantages of both MAS and RL, Multi-Agent Reinforcement Learning (MARL) was introduced for TA [33].…”
Section: Introduction (mentioning)
confidence: 99%
“…The decision-making architecture consists of three layers: (i) Global Route Planning (GRP), (ii) Local Path Planning (LPP), and (iii) Feedback Control (FC). The GRP planning layer assigns optimal waypoints using dynamic programming (DP) [14], [15]. The GRP must rely on cloud and stored database information to define and sequence waypoints beyond onboard sensor range (line-of-sight).…”
Section: Introduction (mentioning)
confidence: 99%