This article addresses the development and tuning of an energy management system for a photovoltaic (PV) battery storage system, aiming at cost-optimized use of PV energy by means of reinforcement learning (RL). An energy management concept based on the Proximal Policy Optimization algorithm in combination with recurrent Long Short-Term Memory neural networks is developed for data-based policy learning. As a reference system for the simulation-based investigations, a PV battery storage system is modelled, parametrized and implemented with an interface for the RL algorithm. To demonstrate the generalization capability of the learned energy management, 98 training and 12 evaluation episodes, each with a length of one year, are generated from an empirical dataset of global radiation and load power time series. To improve the convergence speed and stability of the RL algorithm, as well as the learned policy with regard to technoeconomic metrics, an extensive hyperparameter study is conducted by training 216 control policies with different hyperparameter configurations. Simulation-based benchmark tests of the learned energy management against conventional rule-based and model-predictive energy management strategies show that the RL-based concept achieves slightly better results in terms of energy costs and the amount of energy fed into the grid than the commonly used model-predictive method.
This contribution introduces an energy management concept for multi-use applications of PV battery storage systems based on reinforcement learning (RL). The approach uses the state-of-the-art Proximal Policy Optimization algorithm in combination with recurrent Long Short-Term Memory networks to derive locally optimal energy management policies from a data-driven, simulation-based training procedure. For this purpose, an AC-coupled residential PV battery storage system is modelled and parametrized. Qualitative advantages of the RL-based approach over the commonly used model predictive control (MPC) approaches with regard to multi-use energy management applications, such as the ability to optimize a control policy over an infinite, discounted time horizon, are highlighted. From a large-scale training run of over 200 hyperparameter configurations, the five best energy management policies are selected and evaluated against state-of-the-art MPC and rule-based energy management concepts. In the evaluation over one year, it is shown that the energy management learned by the RL algorithm reduces curtailment losses from 5.70% to 4.78%, reduces specific energy costs from 7.16 Cent kWh⁻¹ to 7.09 Cent kWh⁻¹, and increases the share of PV energy fed into the grid under a fixed feed-in limit from 49.95% to 50.99% compared to the MPC energy management, which is the second-best approach.
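To make the control problem concrete, the following sketch shows one time step of a simple rule-based dispatch of the kind the learned policies are benchmarked against: surplus PV power charges the battery, deficits are covered by discharging, and PV power above the grid feed-in limit is curtailed. All function names, parameter values, and the simplified power-balance logic are illustrative assumptions, not the authors' reference model.

```python
def dispatch_step(pv_power, load_power, soc, capacity_kwh,
                  battery_power_max=5.0, feed_in_limit=7.0, dt_h=0.25):
    """One step of a simple rule-based energy management (illustrative).

    Powers in kW, capacity in kWh, soc in [0, 1], time step dt_h in hours.
    Returns (new_soc, grid_power, curtailed_kwh); grid_power > 0 means
    feed-in to the grid, grid_power < 0 means consumption from the grid.
    """
    surplus = pv_power - load_power  # kW
    if surplus >= 0.0:
        # Charge with the surplus, limited by converter power and free capacity.
        free_kwh = (1.0 - soc) * capacity_kwh
        charge = min(surplus, battery_power_max, free_kwh / dt_h)
        soc += charge * dt_h / capacity_kwh
        feed_in = surplus - charge
        # Energy above the fixed feed-in limit is lost as curtailment.
        curtailed = max(0.0, feed_in - feed_in_limit) * dt_h
        grid = min(feed_in, feed_in_limit)
    else:
        # Discharge to cover the deficit, limited by power and stored energy.
        stored_kwh = soc * capacity_kwh
        discharge = min(-surplus, battery_power_max, stored_kwh / dt_h)
        soc -= discharge * dt_h / capacity_kwh
        grid = surplus + discharge  # <= 0: residual grid consumption
        curtailed = 0.0
    return soc, grid, curtailed
```

An RL-based energy management replaces this fixed rule with a learned policy that maps observations (e.g. radiation and load history, state of charge) to charging decisions, which is how curtailment losses can be reduced below the rule-based and MPC baselines.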