The aim of the article is to approximate optimal relative control of an underactuated spacecraft using reinforcement learning and to study the influence of various factors on the quality of such a solution. In the course of this study, methods of theoretical mechanics, control theory, stability theory, machine learning, and computer modeling were used. The problem of in-plane spacecraft relative control using only control actions applied tangentially to the orbit is considered. This approach makes it possible to reduce the propellant consumption of reactive actuators and to simplify the architecture of the control system. However, in some cases, methods of classical control theory do not yield acceptable results. For this reason, the possibility of solving the problem by reinforcement learning methods has been investigated. These methods allow designers to find near-optimal control algorithms through interactions of the control system with the plant, using a reinforcement signal that characterizes the quality of control actions. The well-known quadratic criterion is used as the reinforcement signal, which makes it possible to take into account both the accuracy requirements and the control costs. The search for control actions is carried out using the policy iteration algorithm, implemented with an actor–critic architecture. Various representations of the actor, which implements the control law, and of the critic, which estimates the value function, are considered using neural network approximators. It is shown that the accuracy of the optimal control approximation depends on a number of features, namely, an appropriate structure of the approximators, the neural network parameter updating method, and the learning algorithm parameters. The investigated approach makes it possible to solve the considered class of control problems for controllers of different structures. 
Moreover, the approach allows the control system to refine its control algorithms during the spacecraft operation.
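The combination of a quadratic reinforcement signal and policy iteration described in this abstract can be illustrated on a toy problem. The sketch below is not the article's spacecraft model or its neural network actor and critic: it assumes a hypothetical scalar linear plant x[t+1] = a*x[t] + b*u[t], a linear actor u = -k*x, and a quadratic critic V(x) = p*x^2, so that both policy evaluation and policy improvement have closed forms. All names and numerical values are illustrative assumptions.

```python
def quadratic_cost(x, u, q, r):
    """Quadratic criterion: penalizes state error (accuracy) and control effort (cost)."""
    return q * x * x + r * u * u


def policy_iteration_scalar_lqr(a, b, q, r, k0, iterations=20):
    """Policy iteration for the assumed plant x[t+1] = a*x[t] + b*u[t].

    The actor is u = -k*x and the critic is V(x) = p*x^2.
    Returns the final gain k and value coefficient p.
    """
    k = k0
    p = None
    for _ in range(iterations):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("the current policy does not stabilize the plant")
        # Policy evaluation (critic): total cost of following u = -k*x forever,
        # i.e. the geometric sum of (q + r*k^2) * closed_loop^(2t) * x0^2.
        p = (q + r * k * k) / (1.0 - closed_loop ** 2)
        # Policy improvement (actor): minimize stage cost plus p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return k, p


if __name__ == "__main__":
    gain, value = policy_iteration_scalar_lqr(a=1.0, b=0.5, q=1.0, r=1.0, k0=1.0)
    print("gain:", gain, "value coefficient:", value)
    print("reward for x=1:", -quadratic_cost(1.0, -gain * 1.0, 1.0, 1.0))
```

For these assumed parameters the value coefficient converges to the positive root of the scalar discrete-time Riccati equation, which gives a simple check that the iteration approaches the optimal control.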
The subject of research is the process of creating a neural network model (NNM) for determining the force impact of an ion thruster (IT) plume on an orbital object during non-contact space debris removal. The work aims to develop NNMs and study the influence of various factors on the accuracy of determining the force transmitted by the ion plume of the thruster to a space debris object (SDO). The tasks are to choose the structures of the NNMs, to form a data set and use it to train and validate the NNMs, and to explore the influence of the model structure and optimizer parameters on the accuracy of force determination. The methods used are plasma physics, computer simulation, deep learning, and optimization using an improved version of stochastic gradient descent. As a result of the research, three NNMs have been developed, which differ in the number of hidden layers and in the number of neurons in the hidden layers. For training and validation of the NNMs, a data set was generated for an SDO approximated by a cylinder using a self-similar description of the ion plasma propagation. The data set was obtained for various relative positions and orientations of the object in the process of its removal from orbit. Using this data set, the NNM parameters were optimized by supervised learning. The optimizer and its parameters were selected to provide a small error at the validation stage. It was found that the accuracy of determining the force depends on the relative position and orientation of the SDO, as well as on the architecture of the NNM, and the features of this influence were identified. The approach demonstrates that deep learning methods can be used to determine the force impact of the IT plume on the SDO. The proposed models determine the force impact with an accuracy that is sufficient for solving the considered class of problems. 
At the same time, the NNMs produce results much faster than the previously used methods. This makes the NNMs promising for use both on board and in mathematical modeling of space debris removal missions.
The advances in deep learning have revolutionized the field of artificial intelligence, demonstrating the ability to create autonomous systems with a high level of understanding of the environments where they operate. These advances, as well as new tasks and requirements in space exploration, have led to an increased interest in these deep learning methods among space scientists and practitioners. The goal of this review article is to analyze the latest advances in deep learning for navigation, guidance, and control problems in space. The problems of controlling the attitude and relative motion of spacecraft are considered for both traditional and new missions, such as orbital service. The results obtained using these methods for landing and hovering operations considering missions to the Moon, Mars, and asteroids are also analyzed. Both supervised and reinforcement learning are used to solve such problems based on various architectures of artificial neural networks, including convolutional and recurrent ones. The possibility of using deep learning together with methods of control theory is analyzed to solve the considered problems more efficiently. The difficulties that limit the application of the reviewed methods for space applications are highlighted. The necessary research directions for solving these problems are indicated.
The aim of this paper is to develop an effective algorithm for intelligent control of spacecraft based on reinforcement learning (RL) methods. In the development and analysis of the algorithm, methods of theoretical mechanics, automatic control and stability theories, machine learning, and computer simulation were used. To increase the RL efficiency, a statistical model of spacecraft dynamics based on the concept of Gaussian processes was used. On the one hand, such a model allows one to use a priori information about the plant and is sufficiently flexible, and on the other hand, it characterizes uncertainty in the dynamics in the form of confidence intervals and can be refined during the spacecraft operation. In this case, the problem of control/state space analysis reduces to obtaining such measurements that narrow the confidence intervals. The familiar quadratic criterion, which allows one to take into account both the accuracy requirements and the control cost, was used as the reinforcement signal. An RL-based search for control actions was made using a control law iterative algorithm. To implement the regulator and evaluate the cost function, neural network approximators were used. Spacecraft motion stability guarantees were obtained using the Lyapunov function method with account for the uncertainty in the spacecraft dynamics. The cost function was chosen as a candidate Lyapunov function. To simplify the stability test on the basis of this methodology, the dynamics of the plant was assumed to be Lipschitz continuous, which made it possible to use the Lagrange multiplier method for searching for control actions with account for the constraints formulated using the upper uncertainty bound and Lipschitz dynamics constants. The efficiency of the proposed algorithm is illustrated by computer simulation results. 
The approach makes it possible to develop control systems that can improve their performance as data are accumulated during the operation of a specific object, thus allowing one to reduce the requirements for its elements (sensors, actuators), do without special test equipment, and reduce the development time and cost.
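The idea that a Gaussian process model expresses uncertainty as confidence intervals, and that new measurements narrow those intervals, can be sketched in a few lines. The example below is a toy one-dimensional regression, not the spacecraft dynamics model of the paper: the squared-exponential kernel, its hyperparameters, the noise level, and the 1.96-sigma (about 95%) interval are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import math


def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of covariance function)."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_lower_transposed(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x


def gp_predict(xs, ys, x_new, noise=1e-4):
    """Posterior mean and approximate 95% confidence interval at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, ys))
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    half_width = 1.96 * math.sqrt(var)
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0], [0.0, 0.84, 0.91]
    print("before new measurement:", gp_predict(xs, ys, 5.0))
    print("after new measurement: ", gp_predict(xs + [5.0], ys + [-0.96], 5.0))
```

Far from the data the interval is wide, and adding one measurement at the query point collapses it, which is the sense in which exploration reduces to collecting measurements that narrow the confidence intervals.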
The goal of this article is to develop an effective image preprocessing algorithm and a neural network model for determining the force to be transmitted to a space debris object (SDO) for its non-contact deorbit. In the development and study of the algorithm, methods of theoretical mechanics, machine learning, computer vision, and computer simulation were used. The force is determined using a photo taken by an onboard camera. To increase the efficiency of the neural network, an algorithm was developed for recognizing features along the SDO edge in the photo. The algorithm, on the one hand, selects a sufficient number of features to describe the properties of the figure and, on the other hand, significantly reduces the amount of data at the neural network input. A dataset with the features and corresponding reference force values was created for model training. A neural network model was developed to determine the force to be exerted on an SDO from the SDO features. The model was tested using a set of eighteen calculated cases to determine the effectiveness, accuracy, and speed of the algorithm. The proposed algorithm was compared with two existing ones: the method of central projections onto an auxiliary plane and the multilayered neural network model that calculates the force using the SDO orientation parameters. The comparison was performed using the root mean square error, the maximum absolute error, and the maximum relative error. The test results are presented as tables and graphs. The proposed approach makes it possible to develop a system of SDO non-contact removal that does not need to determine the exact relative position and orientation with respect to the active spacecraft. Instead, the algorithm uses camera-taken photos, from which the features necessary for calculation are extracted. 
This makes it possible to reduce the requirements for its computing elements, to abandon sensors for determining the relative position and orientation, and to reduce the cost of the system.
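The three comparison metrics named in this abstract (root mean square error, maximum absolute error, and maximum relative error) can be computed as in the following minimal sketch. The abstract does not give the exact definitions, so the sketch assumes the relative error is taken with respect to the magnitude of the reference value and that all reference values are nonzero. The function name and the sample numbers are hypothetical.

```python
import math


def error_metrics(predicted, reference):
    """Return (RMSE, max absolute error, max relative error).

    Assumes equal-length, non-empty sequences and nonzero reference values.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_errors = [abs(p - r) for p, r in zip(predicted, reference)]
    rmse = math.sqrt(sum(e * e for e in abs_errors) / len(abs_errors))
    max_abs = max(abs_errors)
    # Relative error is measured against the reference magnitude (an assumption).
    max_rel = max(e / abs(r) for e, r in zip(abs_errors, reference))
    return rmse, max_abs, max_rel


if __name__ == "__main__":
    # Illustrative force values only; these are not results from the article.
    reference_force = [1.0, 2.0, 4.0]
    predicted_force = [1.1, 1.8, 4.4]
    print(error_metrics(predicted_force, reference_force))
```

Reporting the maximum errors alongside the RMSE shows the worst case over the test set, which an averaged metric alone would hide.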