As batch reinforcement learning algorithms reach maturity and neural networks are increasingly used in reinforcement learning, a performance comparison of these models is warranted. This paper discusses the implementation of a heat pump agent in a demand response setting and its cost effectiveness when implemented with different neural network types. The agent keeps the interior air temperature of a building within pre-set temperature constraints, with four actions at its disposal. The agent is incentivized to shift loads in a day-ahead market in order to minimize daily electricity costs. The simulation considered a multilayer perceptron (MLP), a convolutional neural network (CNN) and a long short-term memory neural network (LSTM) to model the environment dynamics. All architectures outperform a trivial thermostat controller and shift loads successfully after 20-25 days. For this particular setup, there is no significant difference between the MLP and the LSTM, while both outperform the CNN model. The MLP is preferred as it requires far less computation time.
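To make the baseline and the cost objective concrete, the following is a minimal sketch of a hysteresis thermostat rule and of the daily electricity cost under a price profile. It is not the paper's implementation: the function names, the binary on/off action (the paper's agent has four actions), and the units (power in kW, prices per kWh, fixed time step in hours) are all assumptions made for illustration.

```python
from typing import List


def thermostat_action(temp: float, heating: bool, t_min: float, t_max: float) -> bool:
    """Hypothetical hysteresis thermostat (assumed binary on/off control).

    Switches heating on at or below the lower bound, off at or above the
    upper bound, and otherwise keeps the previous state.
    """
    if temp <= t_min:
        return True
    if temp >= t_max:
        return False
    return heating


def daily_cost(power_kw: List[float], prices: List[float], step_hours: float) -> float:
    """Cost of a consumption profile: sum of power * price * step duration.

    Assumes one price per time step, expressed per kWh.
    """
    if len(power_kw) != len(prices):
        raise ValueError("power and price profiles must have the same length")
    return sum(p * c * step_hours for p, c in zip(power_kw, prices))
```

A load-shifting agent would aim for a lower `daily_cost` than this thermostat rule by consuming in cheaper time steps while still respecting the temperature bounds.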
The increasing share of renewable energy sources in the electricity grid results in a higher degree of uncertainty regarding electrical energy production. In response, demand flexibility has been proposed as part of the solution. An important source of flexibility available at the residential consumer side is thermostatically controlled loads (TCLs). In this paper, this source of flexibility is activated by applying batch reinforcement learning (BRL) to an electric water heater (EWH) in a time-of-use (ToU) setting. The cost performance of six BRL agents with six different state spaces is compared quantitatively. In every case, the BRL agent can successfully shift energy consumption within 20-25 days. The performance of an agent with access to multiple temperature sensors along the height of the EWH is comparable to the performance of an agent with access to only the highest temperature sensor. This indicates that manufacturing costs related to sensors can be reduced while maintaining the same performance. Additionally, results show that the inclusion of a theoretical state of charge value in the state space increases performance by more than 8% compared to the performance of the other BRL agents. It is therefore argued that an estimation of the state of charge should be included in future work as it would increase cost performance.
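As an illustration of what a state of charge value for a water heater can look like, the sketch below uses one common simplification: the mean of the layer temperatures, normalized between a minimum and maximum temperature. This is an assumption for illustration only; the abstract does not define the paper's state of charge, and the function name, the equal-volume layers, and the temperature bounds are hypothetical.

```python
from typing import Sequence


def state_of_charge(layer_temps: Sequence[float], t_min: float, t_max: float) -> float:
    """Hypothetical state of charge in [0, 1] for a stratified water tank.

    Assumes equal-volume layers, so stored energy is proportional to the
    mean layer temperature above t_min. Not the paper's definition.
    """
    if not layer_temps:
        raise ValueError("at least one layer temperature is required")
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    mean_temp = sum(layer_temps) / len(layer_temps)
    soc = (mean_temp - t_min) / (t_max - t_min)
    # Clip so sensor readings outside the bounds still give a valid fraction.
    return max(0.0, min(1.0, soc))
```

Under this simplification, an agent that sees only the top sensor cannot compute the value directly, which is why an estimate would be needed in practice.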