Thermostatically Controlled Loads (TCLs) provide demand flexibility and are often considered well suited for Demand Response (DR) applications. Due to their heterogeneity, and the consequent lack of accurate dynamics models, Reinforcement Learning (RL) is often used to exploit this flexibility. Unfortunately, RL requires exploratory interaction with the TCL, resulting in a period of potential discomfort for the users. We present an approach to reduce this exploratory period by pretraining the RL agent, using domain randomization to facilitate knowledge transfer. We evaluate the potential of pretraining in a DR energy arbitrage scenario with an Electric Water Heater (EWH). Our experiments show that a priori knowledge about EWH dynamics can be used to initialize and improve the control policy, with pretraining contributing 8.8 % additional cost savings compared to starting from scratch.
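The abstract describes pretraining with domain randomization only at a high level. The sketch below is a minimal illustration of the general idea, not the paper's implementation: a toy first-order EWH model, a tabular Q-learning agent, and parameter ranges that are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class EWHEnv:
    """Simplified first-order electric water heater model (illustrative,
    not the paper's simulator). State: tank temperature; action: heat on/off."""

    def __init__(self, volume_l, loss_coeff, power_kw=2.4, dt_h=0.25):
        self.c = 4.186 * volume_l / 3600.0   # tank heat capacity in kWh per degC
        self.loss = loss_coeff               # standing loss in kW per degC above ambient
        self.power = power_kw
        self.dt = dt_h
        self.temp = 55.0

    def step(self, heat_on, price, ambient=20.0, demand_kw=0.3):
        # Energy balance: heating minus standing losses minus hot-water draw.
        q = self.power * heat_on - self.loss * (self.temp - ambient) - demand_kw
        self.temp += q * self.dt / self.c
        cost = price * self.power * heat_on * self.dt
        comfort_penalty = 10.0 if self.temp < 45.0 else 0.0  # discomfort proxy
        return -(cost + comfort_penalty)

def sample_randomized_env():
    """Domain randomization: draw plausible EWH parameters so the policy
    sees many different dynamics during pretraining. Ranges are assumptions."""
    return EWHEnv(volume_l=rng.uniform(100, 300),
                  loss_coeff=rng.uniform(0.002, 0.01))

# Tabular Q-learning over a coarse (temperature bin, price bin) state space.
N_T, N_P = 12, 4
Q = np.zeros((N_T, N_P, 2))

def state(env, price):
    t = int(np.clip((env.temp - 40.0) / 30.0 * N_T, 0, N_T - 1))
    p = int(np.clip(price / 0.4 * N_P, 0, N_P - 1))
    return t, p

alpha, gamma, eps = 0.1, 0.95, 0.2
for episode in range(500):                     # pretrain across randomized EWHs
    env = sample_randomized_env()
    for step in range(96):                     # one simulated day, 15-min steps
        price = 0.1 + 0.2 * (step % 24 > 16)   # toy two-level tariff
        s = state(env, price)
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        r = env.step(a, price)
        s2 = state(env, price)
        Q[s][a] += alpha * (r + gamma * Q[s2].max() - Q[s][a])
```

In this sketch, the pretrained Q-table would then initialize the policy deployed on the real EWH, shortening the exploratory (and potentially uncomfortable) phase the abstract refers to.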