2023
DOI: 10.1109/tmc.2022.3146881

Multi-UAV Navigation for Partially Observable Communication Coverage by Graph Reinforcement Learning

Citations: cited by 50 publications (17 citation statements)
References: 36 publications
“…It applied trust region policy optimization (TRPO) [30] as the global and local planners to handle the control at different levels. In our previous work, we proposed the deep recurrent graph network (DRGN) [31], a novel method that is designed for navigation in a large-scale multi-agent system. It constructs inter-agent communication based on a graph attention network (GAT) [32] and applies GRU to recall the long-term historical information of agents.…”
Section: Related Work (mentioning)
confidence: 99%
“…We use the Gumbel-Softmax reparameterization trick [42] in HAMA and MADDPG to make them trainable in discrete action spaces. DGN is based on our proposed algorithm [31], which applies a GAT layer for inter-agent communication. We train our method and each baseline for 100 K episodes and test them for 10 K episodes.…”
Section: Simulation 61 Set Up (mentioning)
confidence: 99%
“…• UAV-MBS (Ye et al 2021b) is a cooperative task with 20 UAVs served as mobile base stations to fly around a target region to provide communication services to the randomly distributed ground users.…”
Section: Environments (mentioning)
confidence: 99%
“…In this research, the control problem is complex since it needs to optimize four objectives at the same time: coverage ability, energy consumption, connectivity, and fairness. Therefore, the DDPG is a promising solution, and it can be used along with the designed utility of the game model to achieve more coverage, less energy consumption, and high fairness, while keeping the UAVs connected all the time [29][30][31]. It can also deal with complex state spaces and with time-varying environments, and it uses powerful deep neural networks (DNNs) to assist the UAV in making decisions and providing high-quality services for the UAV network.…”
Section: Introduction (mentioning)
confidence: 99%