2022
DOI: 10.3390/jmse10122018

Autonomous Underwater Vehicle Path Planning Method of Soft Actor–Critic Based on Game Training

Abstract: This study addresses the safe navigation of autonomous underwater vehicles (AUVs) in unknown underwater environments. During underwater navigation, an AUV can encounter canyons, rocks, reefs, fish, and other underwater vehicles that threaten its safety. A game-based soft actor–critic (GSAC) path planning method is proposed to improve the adaptive capability of autonomous planning and the reliability of obstacle avoidance in an unknown underwater environment. Considering the influence o…

Cited by 5 publications (6 citation statements)
References 42 publications

“…SAC adopts the actor-critic framework. The actor collects data by interacting with the environment [31], while the critic's value function directs the actor in learning a more efficient strategy by using a policy-based gradient optimizer [32]. The critic learns a value function to measure the quality of state-action pairs from the data collected by the actor, and then helps the actor to update the policy.…”
Section: Algorithmic Framework Based On SAC (citation type: mentioning)
confidence: 99%
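
To make the critic's role in the statement above concrete, here is a minimal sketch of the target that a soft actor-critic critic is commonly regressed toward. It follows the standard SAC formulation with twin critics and an entropy term, not necessarily the exact variant used in the citing or cited papers. The function name `soft_q_target` and the default values of `gamma` and `alpha` are assumptions chosen for illustration.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target for one transition (illustrative sketch).

    reward        -- reward received for the transition
    done          -- True if the episode ended on this transition
    next_q1/q2    -- the two critics' estimates for (next_state, next_action)
    next_log_prob -- log-probability of next_action under the current policy
    gamma, alpha  -- discount factor and entropy temperature (assumed values)
    """
    # Take the smaller of the two critic estimates to limit overestimation,
    # then add the entropy bonus (-alpha * log pi).
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    # No bootstrapping past a terminal state.
    return reward + gamma * (1.0 - float(done)) * soft_value


if __name__ == "__main__":
    print(soft_q_target(1.0, False, 2.0, 3.0, -1.0, gamma=0.5, alpha=0.5))
```

The critic is trained to match this target on data collected by the actor, and the actor is then updated to prefer actions that the critic scores highly while keeping the policy's entropy high.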
“…Four algorithms based on DDPG, EDDPG, SAC, and ENSAC were applied to verify the effectiveness of MOP observation path planning. The hyperparameters for algorithmic models were initially based on references, i.e., based on the hyperparameters of the SAC algorithm primarily referenced [57][58][59], while the hyperparameters of the ENN network mainly referenced [43,60]. Moreover, the learning rate controls the size of each parameter update.…”
Section: Parameter Setting (citation type: mentioning)
confidence: 99%
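
The remark that the learning rate controls the size of each parameter update can be illustrated with a plain gradient-descent step. This is a generic sketch, not the optimizer used in the citing papers. The helper name `sgd_step` and the example parameter and gradient values are hypothetical.

```python
def sgd_step(params, grads, learning_rate):
    """Return parameters after one plain gradient-descent update."""
    # Each parameter moves against its gradient, scaled by the learning rate.
    return [p - learning_rate * g for p, g in zip(params, grads)]


if __name__ == "__main__":
    params, grads = [1.0, -2.0], [2.0, 4.0]  # hypothetical values
    for lr in (0.0001, 0.0003):
        print(lr, sgd_step(params, grads, lr))
```

For the same gradient, tripling the learning rate triples the distance each parameter moves in a single update.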
“…If no obstacle is detected in a particular zone, then δ i z t and d i z s t have constant values. Therefore, the detection result χ of the forward-looking sonar can be described as shown below [28]:…”
Section: Hunting Environment Description (citation type: mentioning)
confidence: 99%
“…The deep neural networks were constructed using Pytorch, and the network models were trained using GPU. The hyperparameters for S4AC were initially based on references, i.e., based on the hyperparameters of the SAC algorithm primarily referenced our previous work [28], while the hyperparameters of the GAN network mainly referenced [45]. Moreover, the learning rate was slightly adjusted from 0.0001 to 0.0003 to expedite the training process.…”
Section: Experimental Environment and Training Parameters (citation type: mentioning)
confidence: 99%