2021
DOI: 10.1109/tsg.2020.3034827

Multi-Agent Safe Policy Learning for Power Management of Networked Microgrids

Abstract: This paper presents a multi-agent constrained reinforcement learning (RL) policy gradient method for optimal power management of networked microgrids (MGs) in distribution systems. While conventional RL algorithms are black box decision models that could fail to satisfy grid operational constraints, our proposed RL technique is constrained by AC power flow equations and other operational limits. Accordingly, the training process employs the gradient information of the power management problem constraints to en…

Cited by 89 publications (42 citation statements)
References 29 publications
“…The RES forecast is assumed to have a Gaussian distribution II. During implementation of CPO, the transition to state s_{t+1} from s_t under action a_t is achieved by solving equations (2a)-(2c), (7a), and (10a) while respecting the constraints (3)-(4), (15), and (17). The constraints on the variables in action a_t are implemented via the constraint function C(s_t, a_t).…”
Section: Simulation Results
Mentioning confidence: 99%
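The constraint-function mechanism quoted above can be sketched in code. The box limits, toy dynamics, and clipping projector below are hypothetical stand-ins for the paper's AC power-flow and operational constraints:

```python
import numpy as np

def constraint_fn(state, action, p_min=-1.0, p_max=1.0):
    """Hypothetical constraint function C(s_t, a_t): total violation of
    simple box limits on the action (standing in for AC power-flow and
    operational limits). Zero means the action is feasible."""
    lower = np.maximum(p_min - action, 0.0)   # amount below lower bound
    upper = np.maximum(action - p_max, 0.0)   # amount above upper bound
    return float(np.sum(lower + upper))

def safe_step(state, action, dynamics, projector):
    """One environment step: an infeasible action is projected back onto
    the constraint set before the system equations produce s_{t+1}."""
    if constraint_fn(state, action) > 0.0:
        action = projector(action)
    return dynamics(state, action)

# toy usage: linear dynamics and a clipping projector
dynamics = lambda s, a: s + 0.1 * a
projector = lambda a: np.clip(a, -1.0, 1.0)
s_next = safe_step(np.zeros(3), np.array([2.0, 0.5, -3.0]), dynamics, projector)
```

Projection is only one way to enforce C(s_t, a_t) = 0; CPO itself constrains the policy update instead of the individual action.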
See 1 more Smart Citation
“…The RES forecast is assumed to have a gauussian distribution II. During implementation of CPO, the transition to state s t+1 from s t under action a t is achieved by solving equations (2a)-(2c), (7a), and (10a) while respecting the constraints (3)-( 4), (15), and (17). The constraints on the variables in action a t are implemented via the constraint function C(s t , a t ).…”
Section: Simulation Resultsmentioning
confidence: 99%
“…Reinforcement learning (RL) is concerned with determining which actions an agent, which interacts with its environment, should take such that the reward collected (as a function of actions taken) is maximized over a given time horizon. In the last decade, RL has been used in various power systems applications such as volt/var control [13], EV charging schedule determination [14], power management in networked MGs [15], and optimal control of ESS in MGs [16].…”
Section: Introduction
Mentioning confidence: 99%
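The reward-maximization idea described in that statement can be illustrated with a minimal REINFORCE policy-gradient update. The 1-D Gaussian policy and quadratic reward are illustrative assumptions, not the constrained method of the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def reinforce_step(theta, episodes=500, horizon=10, lr=0.05):
    """One REINFORCE update for a 1-D Gaussian policy a ~ N(theta, 1) on
    a toy task with per-step reward -(a - 2)^2, so the expected return
    over the horizon is maximized at theta = 2. Illustrative only."""
    grad = 0.0
    for _ in range(episodes):
        ret, score = 0.0, 0.0
        for _ in range(horizon):
            a = theta + rng.standard_normal()   # sample an action
            ret += -(a - 2.0) ** 2              # reward collected
            score += a - theta                  # d/dtheta log N(a; theta, 1)
        grad += score * ret                     # REINFORCE gradient estimate
    return theta + lr * grad / episodes         # gradient ascent on the return
```

Repeated updates move theta toward the return-maximizing value; the applications cited ([13]-[16]) use far richer state and action spaces but the same underlying objective.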
“…The topmost layer deals with faults that stem from multiple subsystems, denoting a system-level incident that is out of the scope of subsystem-level controllers. A model-free control framework [81] allows topology changes and promotes continuous operation of the system under faulty conditions, proving beneficial despite lower performance in terms of dynamics and disturbances.…”
Section: Space MGs Control Framework
Mentioning confidence: 99%
“…Distributed optimization algorithms solve large-scale and data-intensive problems in a wide range of application areas such as communications [16][17][18][19], electricity grid [20,21], large-scale multiagent systems [22,23], smart grids, wireless sensor networks [24], and statistical learning. Zhang and Sahraei-Ardakani have developed a fully distributed DC optimal power flow method that incorporates flexible transmission and discussed the effect of communication limitations on the convergence properties [25,26].…”
Section: Distributed Optimization
Mentioning confidence: 99%
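A minimal sketch of the decentralized flavor of such algorithms: consensus averaging plus local gradient steps. The quadratic agent costs and weight matrix below are illustrative assumptions, not the DC optimal power flow method of [25, 26]:

```python
import numpy as np

def distributed_consensus_gd(costs_ab, W, steps=200, lr=0.1):
    """Decentralized gradient descent: agent i privately holds the cost
    0.5 * a_i * (x - b_i)^2 and a local estimate x_i. Each round, agents
    mix estimates through a doubly stochastic weight matrix W, then take
    a gradient step on their own cost. Illustrative sketch only."""
    a = np.array([c[0] for c in costs_ab], dtype=float)
    b = np.array([c[1] for c in costs_ab], dtype=float)
    x = np.zeros(len(costs_ab))
    for _ in range(steps):
        x = W @ x                  # consensus: average with neighbours
        x -= lr * a * (x - b)      # local gradient of the private cost
    return x

# three fully connected agents; the optimum of the summed cost is the
# a-weighted mean of the b_i, here (1*1 + 2*4 + 1*7) / 4 = 4.0
W = np.full((3, 3), 0.25) + np.eye(3) * 0.25
x = distributed_consensus_gd([(1.0, 1.0), (2.0, 4.0), (1.0, 7.0)], W)
```

With a constant step size this scheme only reaches a neighbourhood of the exact optimum; exact distributed methods add gradient-tracking corrections or diminishing step sizes.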