Online Actor-critic Reinforcement Learning Control for Uncertain Surface Vessel Systems with External Disturbances

Vu, Van Tu; Huy, Tran Quang; Pham, Thanh Loc; Nam, Đào Phương

doi:10.1007/s12555-020-0809-7

Cited by 23 publications

(12 citation statements)

References 36 publications


“…Just last year, Banginwar [25] offered an initial proposal of such applications (still neglecting nonlinear coupling terms from the transport theorem), where follow–on efforts should incorporate the nonlinear coupling terms. Alternatively, a trajectory tracking control approach for an uncertain surface vessel using the new cascade structure of adaptive reinforcement learning algorithm and kinematic controller, feed-forward term was offered in [26], while an adaptive reinforcement learning optimal tracking control algorithm was presented in [27, 28] for an underactuated surface vessel subject to modeling uncertainties and time-varying external disturbances.…”

Section: Methods (mentioning)

confidence: 99%

Bilinear Interpolation of Three–Dimensional Gain–Scheduled Autopilots

Koo, Sands (2023), Sensors


Gain-scheduled autopilots have emerged as a dominant strategy to achieve adaptive control of coupled, non-linear engineering complexities, owing to an ability to adapt to changing operational conditions and uncertainties. This study focuses on utilizing bilinear interpolation of gain-scheduled autopilots, emphasizing enhanced system performance and robustness. Through a comprehensive investigation and comparative analysis using three disparate cases, advantages over conventional methods are revealed. Strengths and weaknesses of both simple and specialized variants (such as linear, and real-time gain-scheduling) are introduced. Three missile guidance case–studies utilize simulation time and miss distance figures of merit. Comparing the performance of bilinear interpolation and automatic instantiations to index–search, over comparable traveled distances, missile miss distances were improved 179% and 196% respectively with slightly improved computational burden.
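As a rough illustration of what bilinear interpolation of a gain schedule involves, the sketch below looks up a gain tabulated over two scheduling variables and blends the four surrounding table entries. It is not the implementation from the cited paper: the function name `bilinear_gain`, the choice of Mach number and altitude as scheduling variables, and all table values are hypothetical, and clamping at the table edges is an assumed design choice.

```python
from bisect import bisect_right


def bilinear_gain(x_grid, y_grid, table, x, y):
    """Interpolate a gain from table[i][j], tabulated at (x_grid[i], y_grid[j]).

    Both grids must be strictly increasing and hold at least two points.
    """
    # Clamp the query to the table limits so the lookup never extrapolates
    # (an assumed policy, not taken from the cited paper).
    x = min(max(x, x_grid[0]), x_grid[-1])
    y = min(max(y, y_grid[0]), y_grid[-1])
    # Index of the lower corner of the enclosing grid cell.
    i = min(bisect_right(x_grid, x) - 1, len(x_grid) - 2)
    j = min(bisect_right(y_grid, y) - 1, len(y_grid) - 2)
    # Normalised position inside the cell, each in [0, 1].
    tx = (x - x_grid[i]) / (x_grid[i + 1] - x_grid[i])
    ty = (y - y_grid[j]) / (y_grid[j + 1] - y_grid[j])
    # Interpolate along x on the two y edges, then along y between them.
    low = (1 - tx) * table[i][j] + tx * table[i + 1][j]
    high = (1 - tx) * table[i][j + 1] + tx * table[i + 1][j + 1]
    return (1 - ty) * low + ty * high


# Hypothetical schedule: one gain tabulated against Mach number and altitude (km).
MACH = [0.5, 1.0, 2.0]
ALT_KM = [0.0, 10.0]
GAINS = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

gain = bilinear_gain(MACH, ALT_KM, GAINS, 0.75, 5.0)
```

An index-search schedule would instead return the nearest tabulated entry, so the gain would jump at cell boundaries; the blended lookup varies continuously across them.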

“…Importantly, we will establish the connection between the trajectory tracking control scheme and the RL-based SV control strategy, ensuring the convergence of learning weights in AC NNs. Before delving into determining the attraction region of the cascade formation control system using the two-layer structure (Figure 1), we will build upon several previous assumptions related to stability and tracking problems, 3,42,50 in addition to the following assumption: (19) for each SV agent is known and bounded by a known positive constant L, as described in the following inequality 0…”

Section: Stability Analysis (mentioning)

confidence: 99%

“…Each SV agent adheres to the assumptions outlined in Assumptions 1 and 2 and meets the bounded conditions during the training process of neural networks, as presented in previous works. 3,50 Furthermore, the signal vectors in each SV agent satisfy the PE condition (39), given by…”

Section: Stability Analysis (mentioning)

confidence: 99%

“…Theorem. Consider a group of $N$ surface vehicles described by Equations (1), forming a bidirectional connected graph. Each SV agent adheres to the assumptions outlined in Assumptions 1 and 2 and meets the bounded conditions during the training process of neural networks, as presented in previous works 3,50. Furthermore, the signal vectors in each SV agent satisfy the PE condition (39), given by $\psi_i^{\prime}(t)=\frac{\omega_i}{\sqrt{1+\upsilon\,\omega_i^T\Gamma\,\omega_i}}$.…”

Section: Formation Control Strategy For Multiple Surface Vessels (mentioning)

confidence: 99%
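The normalized signal in the quoted theorem, the vector ω_i divided by sqrt(1 + υ ω_i^T Γ ω_i), can be evaluated numerically as in the minimal sketch below. The symbols come from the quoted statement; the function name `normalized_regressor`, the numeric values, and the reading of Γ as a symmetric positive semidefinite matrix with υ ≥ 0 are assumptions made only for illustration, not details confirmed by the citing paper.

```python
import math


def normalized_regressor(omega, gamma, upsilon):
    """Return omega / sqrt(1 + upsilon * omega^T Gamma omega).

    omega   : list of floats (the signal vector)
    gamma   : square list of lists, assumed symmetric positive semidefinite
    upsilon : non-negative scalar (assumed)
    """
    n = len(omega)
    # Quadratic form omega^T Gamma omega.
    quad = sum(omega[r] * gamma[r][c] * omega[c]
               for r in range(n) for c in range(n))
    # Under the assumptions above the argument of the square root is >= 1.
    scale = math.sqrt(1.0 + upsilon * quad)
    return [w / scale for w in omega]


# Hypothetical values: a two-dimensional signal with Gamma set to the identity.
omega_i = [3.0, 4.0]
gamma_i = [[1.0, 0.0], [0.0, 1.0]]
psi_i = normalized_regressor(omega_i, gamma_i, 0.96)
```

With a positive definite Γ and υ > 0, this scaling keeps the normalized signal bounded however large ω_i grows.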


Formation control scheme with reinforcement learning strategy for a group of multiple surface vehicles

Nguyen, Dang, Pham et al. (2023), Intl J Robust & Nonlinear

Self Cite


This article presents a comprehensive approach to integrate formation tracking control and optimal control for a fleet of multiple surface vehicles (SVs), accounting for both kinematic and dynamic models of each SV agent. The proposed control framework comprises two core components: a high‐level displacement‐based formation controller and a low‐level reinforcement learning (RL)‐based optimal control strategy for individual SV agents. The high‐level formation control law, employing a modified gradient method, is introduced to guide the SVs in achieving desired formations. Meanwhile, the low‐level control structure, featuring time‐varying references, incorporates the RL algorithm by transforming the time‐varying closed agent system into an equivalent autonomous system. The application of Lyapunov's direct approach, along with the existence of the Bellman function, guarantees the stability and optimality of the proposed design. Through extensive numerical simulations, encompassing various comparisons and scenarios, this study demonstrates the efficacy of the novel formation control strategy for multiple SV agent systems, showcasing its potential for real‐world applications.


“…Similarly, Ref. [28] addressed a tracking control problem for an uncertain SV using ARL based cascaded structure. Current research on the other hand also employs Actor-Critic network by employing DDPG and PPO.…”

Section: Relevant Studies (mentioning)

confidence: 99%

Deep Reinforcement Learning for Integrated Non-Linear Control of Autonomous UAVs

et al. 2022


In this research, an intelligent control architecture for an experimental Unmanned Aerial Vehicle (UAV) bearing unconventional inverted V-tail design, is presented. To handle UAV’s inherent control complexities, while keeping them computationally acceptable, a variant of distinct Deep Reinforcement Learning (DRL) algorithm, namely Deep Deterministic Policy Gradient (DDPG) is proposed. Conventional DDPG algorithm after being modified in its learning architecture becomes capable of intelligently handling the continuous state and control space domains besides controlling the platform in its entire flight regime. Nonlinear simulations were then performed to analyze UAV performance under different environmental and launch conditions. The effectiveness of the proposed strategy is further demonstrated by comparing the results with the linear controller for the same UAV whose feedback loop gains are optimized by employing technique of optimal control theory. Results indicate the significance of the proposed control architecture and its inherent capability to adapt dynamically to the changing environment, thereby making it of significant utility to airborne UAV applications.
