2021
DOI: 10.1109/access.2021.3085328
Low-Cost Multi-Agent Navigation via Reinforcement Learning With Multi-Fidelity Simulator

Abstract: In recent years, reinforcement learning (RL) has been widely used to solve multi-agent navigation tasks, and a high-fidelity simulator is critical to narrowing the gap between simulation and real-world tasks. However, high-fidelity simulators have high sampling costs and bottleneck the training of model-free RL algorithms. Hence, we propose a Multi-Fidelity Simulator framework to train Multi-Agent Reinforcement Learning (MFS-MARL), reducing the total data cost with samples generated by a low-fidelity si…
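The abstract's core idea — drawing most training transitions from a cheap low-fidelity simulator and only a fraction from the expensive high-fidelity one — can be sketched as follows. This is a hypothetical illustration only: the simulator interfaces, dynamics, and mixing ratio are assumptions, not taken from the paper.

```python
import random

# Hypothetical stand-ins for the two simulators; the real MFS-MARL
# interfaces are not described in this excerpt.
def low_fidelity_step(state, actions):
    """Cheap, approximate dynamics (e.g. a simple kinematic model)."""
    next_state = [s + a for s, a in zip(state, actions)]
    return next_state, -1.0  # constant step cost as a placeholder reward

def high_fidelity_step(state, actions):
    """Expensive, accurate dynamics (e.g. a full physics engine)."""
    next_state = [s + 0.95 * a for s, a in zip(state, actions)]
    return next_state, -1.0

def collect_transition(state, actions, hi_fi_fraction=0.1):
    """Sample mostly from the cheap simulator, occasionally from the
    expensive one, reducing the total data-collection cost."""
    if random.random() < hi_fi_fraction:
        return high_fidelity_step(state, actions), "high"
    return low_fidelity_step(state, actions), "low"
```

A training loop would feed the mixed transitions into any model-free RL update; the 10% high-fidelity fraction here is an arbitrary choice for illustration.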

Cited by 1 publication (1 citation statement)
References 13 publications
“…The multi-agent framework is adopted to solve the problem. Researchers in various fields have tried to extend existing single-agent methods to multi-agent settings [24][25][26]: for example, Modular Q-Learning, in which a single-agent problem is divided into subproblems and each agent solves a different one; Ant Q-Learning, in which all agents share a reward; and Nash Q-Learning, which has greatly improved the efficiency of Q-Learning algorithms [27][28][29]. In this paper, the training process is completed in multi-agent parallel mode, and the optimal maintenance policy for the bridge is output by calculating the return of the whole structure.…”
Section: Introduction
confidence: 99%
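The shared-reward idea the citing paper attributes to Ant Q-Learning can be illustrated with a minimal tabular update — a hedged sketch, not the cited algorithm itself; the state/action encoding, learning rate, and discount factor are assumptions for illustration.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # assumed learning rate and discount factor

# One Q-table per agent; every agent updates from the same team reward.
q_tables = [defaultdict(float) for _ in range(2)]

def shared_reward_update(states, actions, reward, next_states, action_space):
    """Each agent applies a standard Q-learning update to its own table,
    but the reward is shared team-wide, reinforcing cooperative behavior."""
    for i, q in enumerate(q_tables):
        best_next = max(q[(next_states[i], a)] for a in action_space)
        td_target = reward + GAMMA * best_next
        q[(states[i], actions[i])] += ALPHA * (td_target - q[(states[i], actions[i])])

shared_reward_update(["s0", "s0"], ["right", "left"], 1.0,
                     ["s1", "s1"], ["left", "right"])
```

Because both agents see the same reward of 1.0, each agent's visited state-action pair is credited equally after this single update.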