Control of a Quadrotor With Reinforcement Learning

Hwangbo, Jemin; Sa, Inkyu; Siegwart, Roland; Hutter, Marco

doi:10.1109/lra.2017.2720851

Cited by 461 publications

(280 citation statements)

References 16 publications

Supporting

Mentioning

277

Contrasting

Unclassified

“…This idea can be applied to other modules as well. Instead of a traditional proportional-integral-derivative (PID) control design, a network can map from a robot's state to motor commands (107). More ambitiously, some works have mapped directly from camera inputs to motor commands (108)(109)(110) and demonstrated navigation through indoor hallways and forests.…”

Section: Learning (mentioning)

confidence: 99%

Autonomous Flight

Tang

Kumar

2018

Annu. Rev. Control Robot. Auton. Syst.

This review surveys the current state of the art in the development of unmanned aerial vehicles, focusing on algorithms for quadrotors. Tremendous progress has been made across both industry and academia, and full vehicle autonomy is now well within reach. We begin by presenting recent successes in control, estimation, and trajectory planning that have enabled agile, high-speed flight using low-cost onboard sensors. We then examine new research trends in learning and multirobot systems and conclude with a discussion of open challenges and directions for future research.

“…Moreover, a zero convergence proof of the control errors was included and it was validated through numerical simulations. In addition, other control approaches have been used for multicopters navigation such as predictive control, 3 sliding-mode control approach, 4 and NN trained using reinforcement learning techniques, 5 inverse dynamic, 6 and inverse kinematics considering energy consumption. 7 Therefore, the great challenge of how to effectively control a UAV to precisely track a desired trajectory is still subject to active research in UAV control.…”

Section: Motivation (mentioning)

confidence: 99%

Identification and adaptive PID Control of a hexacopter UAV based on neural networks

Rosales

Soria

Rossomando

2018

Adaptive Control & Signal

In this paper, a novel adaptive PID controller for trajectory-tracking tasks is proposed. It is implemented in discrete time over a hexacopter, and it takes into consideration the unmanned aerial vehicles (UAVs) nonlinear model. The PID controller is developed following an adaptive neural technique, and its stability is verified by the Lyapunov discrete theory. Besides, the neural identification of the dynamic model of the UAV is presented to backpropagate output errors to adjust PID gains with the purpose of reducing the control errors. The validation of the proposed algorithm is performed through experimental results with a hexacopter.

KEYWORDS: adaptive PID, discrete stability analysis, hexacopter, identification, neural networks

…steady-state errors, and the derivative gain is used to modify any system overshoot. However, since quadrotor is a nonlinear underactuated systems, 8 it is not always possible to use PID control directly for the quadrotor system. Consequently, some authors proposed controllers designed based only on static PID for the UAV control such as the controllers presented in other works. [9][10][11] Other studies proposed the UAV control based on PID controllers with artificial intelligence and dynamic properties, such as the controllers presented in the work of Yang et al, 12 where authors defined a single-neuron PID, where the adjusting of the weights is a function of the control errors. Besides, simulations results were displayed to validate the proposal but the stability analysis is omitted. In addition, Xu and Zhou 13 use self-tuning PID controller based on adaptive pole placement to a quadrotor by using a linearized version of the dynamic model. Another class of PID controllers had the capabilities to backpropagate control errors for a better adjust of the controller gains, some of those are described in this work. Tsakalis and Dash 14 proposed an algorithm for  ∞ for online tuning of PID gains.
The estimator is very complex to implement, although is robust with respect to the excitation spectrum. The algorithm is validated by using simulations results for different transfer functions. In the work of Liu et al, 15 the design of a Gaussian potential function network PID (GPFN-PID) controller based on GPFN NN was described. The proposed controller had the functions of online training, self-training, and self-adjusting, which made control of the UAV longitudinal channel more effective. However, the results obtained were based on simulations. Babu et al 16 proposed a gradient descent-based methodology for adjusting online the gains of a PID controller based on a cost function defined to penalize errors. The controller is designed only for positioning task and a commercial quadrotor is used to experiment the controller in position task. In the work of Wang et al, 17 an intelligent PID controller based on radial basis functions (RBFs) NN was designed for longitudinal attitude control of a small UAV using a nonlinear model. An RBF NN was utilized for online updatin...

“…Sutton et al [7] have shown that RLC is direct adaptive optimal control, and that the system will converge to the optimal solution given infinite amount of trials. Hwangbo et al [11] have shown an example of training a controller for stabilizing a quad rotor using RLC. The RLC directly mapped the state of the quad rotor to actuator command, making any predefined control structure obsolete.…”

Section: Reinforcement Learning Control (mentioning)

confidence: 99%
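
The direct state-to-actuator mapping described in the statement above can be illustrated with a minimal sketch. This is not the cited authors' implementation: the 18-value state layout, the two hidden layers of 64 tanh units, the squashing of outputs into [0, 1], and the names `make_policy`, `policy`, and `hover_state` are all assumptions made for illustration, and the weights are random rather than trained by reinforcement learning.

```python
import math
import random

# Assumed sizes, chosen for illustration only.
STATE_DIM = 18   # 9 rotation-matrix entries, position, linear and angular velocity
ACTION_DIM = 4   # one command per rotor
HIDDEN = 64      # hidden-layer width


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation=None):
    """Apply one fully connected layer, with an optional elementwise activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def make_policy(seed=0):
    """Build an untrained policy; a learning algorithm would tune these weights."""
    rng = random.Random(seed)
    return [init_layer(STATE_DIM, HIDDEN, rng),
            init_layer(HIDDEN, HIDDEN, rng),
            init_layer(HIDDEN, ACTION_DIM, rng)]


def policy(params, state):
    """Map a state vector straight to four motor commands in [0, 1]."""
    h = dense(params[0], state, math.tanh)
    h = dense(params[1], h, math.tanh)
    raw = dense(params[2], h)
    # Squashing to [0, 1] is an assumption standing in for normalized rotor commands.
    return [0.5 * (math.tanh(v) + 1.0) for v in raw]


params = make_policy()
# Identity attitude, at rest at the origin (hypothetical test state).
hover_state = [1, 0, 0, 0, 1, 0, 0, 0, 1] + [0.0] * 9
commands = policy(params, hover_state)
```

There is no explicit error term, gain, or cascaded attitude and position loop in this sketch: the network is the whole controller, which is the contrast with PID-style designs that the citing statements draw.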