2020
DOI: 10.1177/1729881420969081

Research on autonomous collision avoidance of merchant ship based on inverse reinforcement learning

Abstract: To learn the optimal collision avoidance policy of merchant ships controlled by human experts, a finite-state Markov decision process model for ship collision avoidance is proposed based on an analysis of the collision avoidance mechanism, and an inverse reinforcement learning (IRL) method based on cross entropy and projection is proposed to obtain the optimal policy from experts’ demonstrations. Collision avoidance simulations in different ship encounters are conducted, and the results show that the policy obtain…

Cited by 9 publications (5 citation statements)
References 28 publications
“…In the study of reinforcement learning, Zhao et al [18] introduced a collision avoidance algorithm based on deep reinforcement learning and demonstrated its superiority over the PID algorithm by providing examples. Zheng et al [19] proposed an inverse reinforcement learning collision avoidance algorithm based on cross entropy and projection. Chun et al [20] presented a deep reinforcement learning-based PPO (Proximal Policy Optimization) algorithm and compared it with the A-star algorithm to prove its superiority.…”
Section: Recent Advances
Citation type: mentioning
confidence: 99%
“…As far as the authors' knowledge, few studies have been conducted to apply IRL for ship collision avoidance problems. Zheng et al [21] introduced the projection-based IRL method [22] and derived the policy imitating sample data which had been acquired from a ship maneuvering simulator. Here, it can be noted that the encounter situations were limited to simple cases (only one-on-one), and the results did not indicate why the system took its actions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…As far as the authors' knowledge, few studies have been conducted to apply IRL for ship collision avoidance problems. Zheng et al 21) introduced the projection-based IRL method 22) and derived the policy imitating sample data which had been acquired from a ship maneuvering simulator. Here, it can be noted that the encounter situations were limited to simple cases (only one-on-one), and the results did not indicate why the system took its actions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%