2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP)
DOI: 10.1109/mlsp55214.2022.9943500

Data-Driven Robust Multi-Agent Reinforcement Learning

Abstract: Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on robust average-reward MDPs under the model-free setting. We first theoretically characterize the structure of solutions to the robust average-reward Bellman equation, which is essential for our later convergence analysis. We then design two model-free algorithms, robust relative value iteration (RVI) TD and robust RVI Q-learning…
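The abstract names the relative value iteration (RVI) style of update but gives no pseudocode. Below is a minimal tabular sketch of an RVI-style Q-learning loop for average-reward RL, written only to illustrate the general update form: the Gymnasium-style environment interface, the reference offset f(Q) = Q[ref_state, ref_action], and the crude worst-case penalty `kappa` are all assumptions for illustration and are not the paper's robust algorithm.

```python
import numpy as np

def rvi_q_learning_sketch(env, num_steps=50_000, alpha=0.1, kappa=0.1,
                          ref_state=0, ref_action=0, seed=0):
    """Illustrative RVI-style Q-learning for average-reward RL (hedged sketch).

    NOTE: not the algorithm from the paper.
    - f(Q) = Q[ref_state, ref_action] is one common RVI offset choice.
    - `kappa` crudely stands in for a worst-case (robust) penalty on the
      next-state value; the paper defines this via an uncertainty set.
    - Assumes a Gymnasium-style discrete environment (env.reset / env.step).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    s, _ = env.reset(seed=seed)
    for _ in range(num_steps):
        # epsilon-greedy behavior policy
        a = env.action_space.sample() if rng.random() < 0.1 else int(np.argmax(Q[s]))
        s_next, r, terminated, truncated, _ = env.step(a)
        # pessimistic next-state value: best action value minus a robustness penalty
        v_next = np.max(Q[s_next]) - kappa * np.abs(np.max(Q[s_next]))
        # RVI update: subtract the reference value f(Q) instead of discounting
        f_q = Q[ref_state, ref_action]
        Q[s, a] += alpha * (r - f_q + v_next - Q[s, a])
        s = s_next
        if terminated or truncated:
            s, _ = env.reset()
    return Q
```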

Cited by 6 publications (9 citation statements)
References 39 publications
“…(2) The Monte Carlo method assumes that the value of each state equals the average of the returns G_t over multiple episodes, each of which must run to a terminal state [100]. The value function of a state is the expected return; under the Monte Carlo assumption, this expectation is simplified to the sample mean.…”
Section: Model-free Reinforcement Learning (mentioning)
confidence: 99%
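The excerpt describes Monte Carlo value estimation, where V(s) is approximated by the sample mean of returns G_t collected from complete episodes. A minimal every-visit sketch is below; the episode data format (a list of (state, reward) pairs per terminated trajectory) is an assumption for illustration.

```python
from collections import defaultdict

def monte_carlo_value(episodes, gamma=1.0):
    """Every-visit Monte Carlo: V(s) is the average of returns G_t observed at s.

    `episodes` is assumed to be a list of [(state, reward), ...] trajectories,
    each of which has run to termination; this interface is illustrative only.
    """
    returns_sum = defaultdict(float)
    returns_cnt = defaultdict(int)
    for episode in episodes:
        g = 0.0
        # walk the episode backwards to accumulate the return G_t at each step
        for state, reward in reversed(episode):
            g = reward + gamma * g
            returns_sum[state] += g
            returns_cnt[state] += 1
    return {s: returns_sum[s] / returns_cnt[s] for s in returns_sum}
```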
“…According to the characteristics and requirements of the problem, choosing a suitable centralized reinforcement learning method can improve the learning effect and decision quality of the agent. Common algorithms include Q-learning, DQNs (deep Q-networks) [127], policy gradient methods [128], proximal policy optimization, etc. Q-learning is a basic centralized reinforcement learning method that makes optimal decisions by learning a value function.…”
Section: Concentrated Reinforcement Learning (mentioning)
confidence: 99%
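The excerpt lists Q-learning as the basic value-based method in this family. For concreteness, a minimal tabular Q-learning sketch is given below; the Gymnasium-style environment interface and the hyperparameter defaults are assumptions, not values from the cited work.

```python
import numpy as np

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Tabular Q-learning sketch: learn Q(s, a) from sampled transitions.

    Assumes a Gymnasium-style discrete environment (env.reset / env.step);
    hyperparameters are illustrative defaults, not tuned values.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((env.observation_space.n, env.action_space.n))
    for _ in range(episodes):
        s, _ = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            a = env.action_space.sample() if rng.random() < eps else int(np.argmax(Q[s]))
            s_next, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # standard Q-learning target: r + gamma * max_a' Q(s', a')
            target = r + gamma * np.max(Q[s_next]) * (not terminated)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```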
“…On regularizing the learning process, Kumar et al [20,22] introduce Q-learning and policy gradient methods for L_p uncertainty sets, but do not evaluate their methods experimentally. Another type of uncertainty set considered in online robust RL is R-contamination, for which previous works have derived a robust Q-learning algorithm [40] and a regularized policy gradient algorithm [41]. R-contamination assumes that the adversary can take the agent to any state, which is too conservative in practice.…”
Section: Related Work (mentioning)
confidence: 99%
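Under R-contamination, the worst-case kernel follows the nominal transition with probability (1 - R) and an arbitrary adversarial one with probability R, so the robust backup mixes the observed next-state value with the globally worst one, which is why the excerpt calls it conservative. The sketch below shows one way such a robust target could modify the Q-learning backup; it follows the standard R-contamination robust Bellman form and is not taken verbatim from the cited works.

```python
import numpy as np

def r_contamination_target(Q, reward, s_next, gamma=0.99, R=0.1):
    """Illustrative robust Q-learning target under R-contamination.

    With probability (1 - R) the nominal next state s_next is used; with
    probability R the adversary may move the agent anywhere, so the target
    falls back to the worst best-action value over all states.
    """
    nominal_value = np.max(Q[s_next])            # value at the observed next state
    worst_value = np.min(np.max(Q, axis=1))      # worst state's best-action value
    return reward + gamma * ((1.0 - R) * nominal_value + R * worst_value)
```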
“…Specifically, model-based methods that solve RMDPs [3,7,9,14,21,44] require access to the nominal transition probability, making it difficult to scale beyond tabular settings. While some recent works [21,22,42,43] introduce model-free methods that add regularization to the learning process, the effectiveness of their methods is not validated in high-dimensional environments. In addition, these methods are based on particular RL algorithms (e.g., policy gradient, Q-learning), limiting their general applicability.…”
Section: Introduction (mentioning)
confidence: 99%
“…The PPO algorithm (Yang et al, 2018) is a reinforcement learning algorithm based on the policy gradient method. It samples data through interaction with the environment and optimizes a surrogate (“alternative”) objective function using stochastic gradient ascent.…”
Section: Comparison Of Various Algorithms (mentioning)
confidence: 99%
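The surrogate objective the excerpt refers to is PPO's clipped objective, which bounds the importance ratio between the new and old policies so each stochastic-gradient-ascent step stays close to the data-collecting policy. A minimal sketch follows; the tensor layout (1-D batch tensors) and the clipping constant 0.2 are illustrative assumptions.

```python
import torch

def ppo_clipped_objective(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO (to be maximized by gradient ascent).

    ratio = pi_new(a|s) / pi_old(a|s); clipping the ratio keeps the policy
    update conservative. Inputs are 1-D tensors over sampled (s, a) pairs.
    """
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # elementwise minimum gives the pessimistic (clipped) surrogate
    return torch.min(unclipped, clipped).mean()
```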