Attitude estimation of a rigid body using magnetic, angular rate, and gravity (MARG) sensors is a research subject relevant to a wide variety of engineering applications. The observer is usually built on the Kalman filter and its extensions, chosen for their versatility and ease of practical implementation. However, the performance of these observers has long suffered from inaccurate process and measurement noise covariance matrices, which in turn entails tedious parameter tuning procedures. To avoid this laborious tuning, we propose in this paper a Q-learning-based approach that autonomously adapts the values of the process and measurement noise covariance matrices. The Q-learning method establishes a reinforcement learning mechanism that searches a predetermined candidate set of noise covariance matrices for the pair yielding the least difference between the predicted and measured outputs. The effectiveness of the Q-learning approach, applied to Extended Kalman filter-based attitude estimation, is validated through the Monte Carlo method using real flight data from an unmanned aerial vehicle.
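The selection mechanism described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a scalar random-walk Kalman filter in place of the attitude EKF, a single-state (bandit-style) simplification of Q-learning, and a reward of 1 / (1 + mean squared innovation). The names `kf_window` and `select_noise_pair`, the window length, and the learning parameters are all hypothetical.

```python
import numpy as np

def kf_window(z, x, p, q, r):
    """Run a scalar random-walk Kalman filter over the window z.

    Returns the final state, its covariance, and the mean squared innovation
    (difference between each measurement and the predicted output).
    """
    sq_innov = 0.0
    for zk in z:
        p = p + q                   # predict: x_k = x_{k-1} + w, Var(w) = q
        innov = zk - x              # measurement minus predicted output
        gain = p / (p + r)          # Kalman gain for measurement variance r
        x = x + gain * innov        # state update
        p = (1.0 - gain) * p        # covariance update
        sq_innov += innov ** 2
    return x, p, sq_innov / len(z)

def select_noise_pair(z, candidates, window=20, alpha=0.3, gamma=0.9,
                      eps=0.2, seed=0):
    """Pick a (q, r) pair from `candidates` with single-state Q-learning.

    Assumption: one action value per candidate pair; the reward is larger
    when the innovations over the window are smaller.
    """
    rng = np.random.default_rng(seed)
    q_values = np.zeros(len(candidates))
    x, p = 0.0, 1.0
    for start in range(0, len(z) - window + 1, window):
        # epsilon-greedy choice of the candidate pair for this window
        if rng.random() < eps:
            action = int(rng.integers(len(candidates)))
        else:
            action = int(np.argmax(q_values))
        q, r = candidates[action]
        x, p, msi = kf_window(z[start:start + window], x, p, q, r)
        reward = 1.0 / (1.0 + msi)  # assumed reward shape, always in (0, 1]
        td_target = reward + gamma * np.max(q_values)
        q_values[action] += alpha * (td_target - q_values[action])
    return int(np.argmax(q_values)), q_values
```

A call such as `select_noise_pair(z, [(1e-4, 0.1), (1e-2, 1.0), (1.0, 1e-2)])` returns the index of the preferred pair together with the learned action values. In a real attitude filter the innovation would be a vector and the reward would need to be chosen accordingly.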
The process and measurement noise covariance matrices significantly impact the performance of the Extended Kalman Filter (EKF) and are often hand-tuned in practice, which is usually a tedious task. Q-learning, a well-known reinforcement learning method, has recently been applied to adapt the noise covariance matrices of the EKF, thanks to its simplicity and its ability to handle uncertain environments. Designing a Q-learning-based EKF (QLEKF) typically involves some heuristics, such as tuning the grid size and the covariance matrix values of each state, and these inevitably degrade the estimation performance when the heuristics are not suitable. To overcome that drawback, we propose a dynamic grid-based Q-learning EKF (DG-QLEKF), which brings two novelties: an updated ε-greedy algorithm and a dynamic grid strategy. The proposed algorithm and strategy can thoroughly exploit an arbitrary search scope and find appropriate values of the noise covariance matrices. The effectiveness of DG-QLEKF, applied in navigation to attitude and bias estimation, is validated through the Monte Carlo method and real flight data from an unmanned aerial vehicle. The DG-QLEKF yields substantially better state estimation than the QLEKF and the traditional EKF.
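The abstract does not specify how the updated ε-greedy algorithm or the dynamic grid work, so the following is only a generic sketch of the two ingredients: standard ε-greedy action selection, and a grid of candidate noise values re-centered on the current best value. The function names `epsilon_greedy` and `recenter_grid` and the log-spaced layout are assumptions made for illustration, not the DG-QLEKF design.

```python
import numpy as np

def epsilon_greedy(q_values, eps, rng):
    """Standard epsilon-greedy: explore with probability eps, else exploit."""
    if rng.random() < eps:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def recenter_grid(center, decades, n=5):
    """Return n log-spaced candidate values spanning +/- `decades` around center.

    Assumption: a dynamic grid is mimicked by calling this again with the
    best value found so far and a smaller `decades`.
    """
    exponents = np.linspace(-decades, decades, n)
    return center * 10.0 ** exponents

# Hypothetical usage: start wide, then narrow the search around the best value.
rng = np.random.default_rng(0)
grid = recenter_grid(1e-2, decades=2.0)        # coarse grid over four decades
scores = np.array([0.1, 0.4, 0.9, 0.3, 0.2])   # made-up action values
best = epsilon_greedy(scores, eps=0.0, rng=rng)
finer = recenter_grid(grid[best], decades=0.5) # finer grid around the best value
```

Re-centering lets a small grid cover a wide search scope over successive rounds, which is one plausible reading of the dynamic grid idea; the actual strategy may differ.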