Economic model predictive control (EMPC) is a promising methodology for the optimal operation of dynamical processes and has been shown to improve process economics considerably. However, EMPC performance relies heavily on the accuracy of the process model. As an alternative to model-based control strategies, reinforcement learning (RL) has been investigated as a model-free control methodology, but its safety and stability remain open research challenges. This work presents a novel framework for integrating EMPC and RL for online model parameter estimation of a class of nonlinear systems. In this framework, EMPC optimally operates the closed-loop system while maintaining closed-loop stability and recursive feasibility. At the same time, to optimize the process, the RL agent continuously compares the measured process states with the model's predictions (nominal states) and modifies the model parameters accordingly. The major advantage of this framework is its simplicity: state-of-the-art RL algorithms and EMPC schemes can be employed with minimal modifications. The performance of the proposed framework is illustrated on a network of reactions with challenging dynamics and practical significance. This framework allows control, optimization, and model correction to be performed online and continuously, making autonomous reactor operation more attainable.
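The interaction between the two components can be made concrete with a short sketch. The following is a minimal illustration only, assuming a hypothetical scalar plant, a one-step economic objective over a coarse input grid, and a proportional correction driven by the prediction error standing in for the trained RL agent; the paper's actual EMPC formulation and RL policy are not reproduced here.

```python
import numpy as np

dt, theta_true, theta_hat = 0.1, 1.5, 0.8  # hypothetical plant and initial estimate

def step(x, u, theta):
    # Shared model structure; plant and nominal model differ only in theta.
    return x + dt * (theta * x * (1.0 - x) + u)

def empc_input(x, theta):
    # One-step "economic" MPC sketch: choose the input minimizing a stage
    # cost (negative product measure plus input penalty) under the current
    # model estimate. A real EMPC would solve a multi-step NLP instead.
    grid = np.linspace(-1.0, 1.0, 41)
    costs = [-step(x, u, theta) + 0.1 * u**2 for u in grid]
    return float(grid[int(np.argmin(costs))])

x, lr = 0.2, 0.5
for k in range(50):
    u = empc_input(x, theta_hat)       # EMPC acts using the nominal model
    x_pred = step(x, u, theta_hat)     # nominal (predicted) next state
    sens = dt * x * (1.0 - x)          # d(x_pred)/d(theta) at the current state
    x = step(x, u, theta_true)         # "measured" next state from the plant
    if abs(sens) > 1e-8:
        # Stand-in for the RL agent's action: correct theta_hat from the
        # mismatch between measured and nominal states.
        theta_hat += lr * (x - x_pred) / sens

print(f"adapted theta_hat = {theta_hat:.3f} (true value {theta_true})")
```

Because the plant and the nominal model share the same structure here, the mismatch term is proportional to the parameter error, so the estimate converges quickly; in the framework described above, a trained RL agent would produce this correction instead of the hand-coded rule.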
The goal of process control is to maintain a process at its desired operating conditions. Disturbances, measurement uncertainties, and high-order dynamics in complex, highly integrated chemical processes pose a challenging control problem. Although advanced process controllers, such as Model Predictive Control (MPC), have been successfully implemented to solve hard control problems, they are difficult to develop, rely on a process model, and require high-performance computers and continuous maintenance. Reinforcement learning presents an appealing option for such complex systems, but little work has been done to apply it to chemical reaction networks of practical significance, to discuss the structure of the RL agent, or to evaluate its performance against benchmark measures. This work (1) applies a state-of-the-art reinforcement learning algorithm (DDPG) to a network of reactions with challenging dynamics and practical significance, (2) simulates disturbances and measurement uncertainties, and (3) defines an observation space based on the working concept of a PID controller, optimizes the reward function to achieve the desired controller performance, and evaluates the RL controller in terms of setpoint tracking, disturbance rejection, and robustness to parameter uncertainties.
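To make item (3) concrete, the sketch below shows one way such a PID-inspired observation space and reward could be set up. It is a minimal, hypothetical example: the first-order plant, noise level, and cost weights are placeholders, not the reaction network or the tuned reward from the paper.

```python
import numpy as np

class PIDObservationEnv:
    """Toy setpoint-tracking environment whose observation mimics the three
    terms a PID controller acts on: error, integral of error, and error
    derivative. (Hypothetical first-order plant, for illustration only.)"""

    def __init__(self, setpoint=1.0, dt=0.1):
        self.setpoint, self.dt = setpoint, dt
        self.reset()

    def reset(self):
        self.y = 0.0                      # measured process variable
        self.e_int = 0.0                  # running integral of the error
        self.e_prev = self.setpoint - self.y
        return self._obs(self.e_prev)

    def _obs(self, e):
        # PID-style observation: (error, integral of error, error derivative)
        return np.array([e, self.e_int, (e - self.e_prev) / self.dt])

    def step(self, u):
        # First-order plant driven by input u, with small measurement noise
        # standing in for the simulated disturbances and uncertainties.
        self.y += self.dt * (-0.5 * self.y + u) + 0.01 * np.random.randn()
        e = self.setpoint - self.y
        self.e_int += self.dt * e
        obs = self._obs(e)
        self.e_prev = e
        # Reward penalizes squared tracking error and control effort, one
        # plausible shaping for setpoint tracking without aggressive inputs.
        reward = -(e ** 2) - 0.01 * u ** 2
        return obs, reward
```

A DDPG agent maps this three-element observation to a continuous control input; wrapping the class in a standard RL environment interface (e.g., Gymnasium's Env) would make it usable with off-the-shelf implementations.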