2023
DOI: 10.3389/fnbot.2022.1105480

Research on reinforcement learning-based safe decision-making methodology for multiple unmanned aerial vehicles

Abstract: A system with multiple cooperating unmanned aerial vehicles (multi-UAVs) can use its advantages to accomplish complicated tasks. Recent developments in deep reinforcement learning (DRL) offer good prospects for decision-making for multi-UAV systems. However, the safety and training efficiencies of DRL still need to be improved before practical use. This study presents a transfer-safe soft actor-critic (TSSAC) for multi-UAV decision-making. Decision-making by each UAV is modeled with a constrained Markov decisi…

Cited by 4 publications
(2 citation statements)
References 25 publications
“…To address a cooperative search problem, Zhang et al used an improved particle swarm optimization algorithm to allocate the UAV reconnaissance area and maximize the utility of the UAV cluster [23]. Of course, reinforcement learning is also a good option. Yue et al proposed a secure transfer soft-AC algorithm with security constraints for maximizing revenue [24]. Baek adopted distributed algorithms so UAV swarms could track ground targets and proposed an optimal sensor management technology and consensus-based decision algorithm to minimize the uncertainty of the target location [25].…”
Section: A Related Work
Citation type: mentioning
confidence: 99%
“…Moreover, deep reinforcement learning (DRL) (Mnih et al, 2015) combines DL and RL to implement end-to-end learning. It makes RL no longer limited to low-dimensional space and greatly expands the scope of application of RL (Wang C. et al, 2020; Chane-Sane et al, 2021; He L. et al, 2021; Kiran et al, 2021; Luo et al, 2021; Wu et al, 2021; Yan et al, 2022; Yue et al, 2023; Zhao et al, 2023). Wu et al (2021) introduced a curiosity-driven method into DRL to improve training efficiency and performance in autonomous driving tasks.…”
Section: Introduction
Citation type: mentioning
confidence: 99%