2020
DOI: 10.1109/tsg.2019.2962625

Safe Off-Policy Deep Reinforcement Learning Algorithm for Volt-VAR Control in Power Distribution Systems

Abstract: Volt-VAR control is critical to keeping distribution network voltages within allowable range, minimizing losses, and reducing wear and tear of voltage regulating devices. To deal with incomplete and inaccurate distribution network models, we propose a safe off-policy deep reinforcement learning algorithm to solve Volt-VAR control problems in a model-free manner. The Volt-VAR control problem is formulated as a constrained Markov decision process with discrete action space, and solved by our proposed constrained…

citations
Cited by 225 publications
(115 citation statements)
references
References 34 publications
“…This indicates that the fitting of the value function in the proposed method was more robust to the uncertainty of network training. This phenomenon is similar to that reported by Wang et al [33], where past experience was used to improve the actor-critic algorithm's parameter update direction. The result of removing off-policy re-weighting revealed that data from past interactions with the environment are also favorable for AMPI-based reinforcement learning.…”
Section: Data Efficiency Verification During Adaptability To Changes In Vehicle Model supporting
confidence: 87%
“…Therefore, the learned strategy may not be feasible in practice. To solve this problem, [46] proposes a volt-var control strategy of distribution network based on safe off-policy DRL algorithm. The volt-var control problem is first modeled as a constrained MDP.…”
Section: A Optimization Of Smart Power and Energy Distribution Grid mentioning
confidence: 99%
“…The basic idea is to introduce some penalty terms corresponding to the security constraints, and minimize them in priority during the learning process. Reference [61] adopted this idea to consider charging constraints of electric vehicle batteries. Reference [62] optimized voltage and reactive power by a safe off-policy deep reinforcement learning algorithm to avoid voltage violations.…”
Section: Category 3 Surrogate Model mentioning
confidence: 99%