2020 IEEE Security and Privacy Workshops (SPW)
DOI: 10.1109/spw50608.2020.00027

On the Robustness of Cooperative Multi-Agent Reinforcement Learning

Abstract: In cooperative multi-agent reinforcement learning (c-MARL), agents learn to cooperatively take actions as a team to maximize a total team reward. We analyze the robustness of c-MARL to adversaries capable of attacking one of the agents on a team. Through the ability to manipulate this agent's observations, the adversary seeks to decrease the total team reward. Attacking c-MARL is challenging for three reasons: first, it is difficult to estimate team rewards or how they are impacted by an agent mispredicting; se…
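As a rough illustration of this threat model (not code from the paper), the sketch below shows a c-MARL rollout in which an adversary perturbs only one victim agent's observation within an L-infinity budget before the team acts; the environment interface, `agents`, and `perturb` function are hypothetical stand-ins.

```python
import numpy as np

def attacked_episode(env, agents, perturb, victim_id, eps=0.05):
    """Roll out one episode while adversarially perturbing a single agent.

    env       -- hypothetical multi-agent env; reset()/step() return per-agent
                 observations and a shared team reward
    agents    -- list of policies, each mapping an observation to an action
    perturb   -- attack function producing a perturbation for one observation
    victim_id -- index of the attacked agent
    eps       -- L-infinity perturbation budget
    """
    obs = env.reset()
    team_reward, done = 0.0, False
    while not done:
        # The adversary only touches the victim's observation, within ||delta||_inf <= eps.
        delta = np.clip(perturb(obs[victim_id]), -eps, eps)
        obs[victim_id] = obs[victim_id] + delta
        actions = [agent.act(o) for agent, o in zip(agents, obs)]
        obs, reward, done, _ = env.step(actions)
        team_reward += reward  # shared team reward the adversary tries to minimize
    return team_reward
```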

Cited by 44 publications (39 citation statements) · References 19 publications
“…Adversarial attacks for DRL agents. Most existing adversarial attacks on DRL agents target a single agent [Huang et al., 2017; Lin et al., 2017; Kos and Song, 2017; Weng et al., 2019], while only one work [Lin et al., 2020] focuses on the c-MARL setting, which makes it the most relevant work to our problem setting of attacking multi-agent RL.…”
Section: Related Work
confidence: 99%
“…However, there are two major differences between our work and Lin et al. [2020]: (1) their attack is only evaluated in the StarCraft Multi-Agent Challenge (SMAC) environment [Samvelyan et al., 2019], where the action spaces are discrete;…”
Section: Related Work
confidence: 99%
“…There is another work [9] that focuses on resilience in cooperative MAS and proposes an Antagonist-Ratio Training Scheme (ARTS), reformulating the original target MAS as a mixed cooperative-competitive game between a group of protagonists, which represent agents of the target MAS, and a group of antagonists, which represent failures in the MAS. In contrast, Lin [7] introduces a novel attack in which the attacker first trains a policy network with reinforcement learning to find a wrong action it should encourage the victim agent to take. The adversary then uses targeted adversarial examples to force the victim to take this action.…”
Section: Related Work
confidence: 99%
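The second stage of the attack described in the citation statement above, forcing the victim to take the adversary-chosen action via a targeted adversarial example, could be sketched as follows. This is a minimal sketch assuming a PyTorch Q-network for the victim; the iterative targeted FGSM-style update is an illustrative choice, not necessarily the exact procedure used in [7].

```python
import torch
import torch.nn.functional as F

def targeted_observation_attack(q_net, obs, target_action, eps=0.05, steps=10):
    """Craft a bounded perturbation that pushes the victim's Q-network toward
    selecting `target_action` (the action the adversary's trained policy wants).

    q_net         -- victim's Q-network: single observation -> Q-values per action
    obs           -- clean observation, a 1-D torch tensor
    target_action -- integer action chosen by the adversary's policy
    eps           -- L-infinity perturbation budget
    """
    obs = obs.detach()
    adv = obs.clone()
    alpha = eps / steps
    for _ in range(steps):
        adv.requires_grad_(True)
        q_values = q_net(adv)
        # Targeted loss: make `target_action` the argmax of the victim's Q-values.
        loss = F.cross_entropy(q_values.unsqueeze(0),
                               torch.tensor([target_action]))
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv - alpha * grad.sign()                 # descend on the targeted loss
            adv = obs + torch.clamp(adv - obs, -eps, eps)   # project back into the eps-ball
    return adv.detach()
```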
“…Afterwards, we manipulate the victim's observations so that it imitates the behavior induced by the deceptive policy, which leads the victim astray from the correct trajectories. Unlike maximizing the least preferred action's Q-values (Lin et al., 2020), our method generates adversarial perturbations by minimizing the KL-divergence between the deceptive policy and the victim policy using Projected Gradient Descent (PGD) (Madry et al., 2017).…”
Section: Introduction
confidence: 99%
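A minimal sketch of that KL-minimizing PGD step, under stated assumptions: `victim_policy` and `deceptive_policy` are hypothetical PyTorch modules mapping a single observation to action logits, and the perturbation is projected into an L-infinity ball in the style of Madry et al. (2017).

```python
import torch
import torch.nn.functional as F

def kl_pgd_attack(victim_policy, deceptive_policy, obs, eps=0.05, alpha=0.01, steps=20):
    """PGD perturbation that minimizes KL(deceptive || victim) on the perturbed
    observation, so the victim's action distribution mimics the deceptive policy."""
    obs = obs.detach()
    with torch.no_grad():
        # Target distribution from the fixed deceptive policy on the clean observation.
        target_probs = F.softmax(deceptive_policy(obs), dim=-1)
    adv = obs.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        victim_log_probs = F.log_softmax(victim_policy(adv), dim=-1)
        # F.kl_div(log_q, p) computes KL(p || q): pull the victim toward the deceptive policy.
        loss = F.kl_div(victim_log_probs, target_probs, reduction="sum")
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv - alpha * grad.sign()                 # gradient descent on the KL loss
            adv = obs + torch.clamp(adv - obs, -eps, eps)   # project into the L_inf ball
    return adv.detach()
```

In this sketch the attacker needs white-box gradient access to the victim policy but never modifies it; only the observation is perturbed, matching the observation-manipulation threat model of the cited paper.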