2021
DOI: 10.48550/arxiv.2106.04207
Preprint

Cooperative Stochastic Multi-agent Multi-armed Bandits Robust to Adversarial Corruptions

Junyan Liu,
Shuai Li,
Dapeng Li

Abstract: We study the problem of stochastic bandits with adversarial corruptions in the cooperative multi-agent setting, where V agents interact with a common K-armed bandit problem, and each pair of agents can communicate with each other to expedite the learning process. In the problem, the rewards are independently sampled from distributions across all agents and rounds, but they may be corrupted by an adversary. Our goal is to minimize both the overall regret and communication cost across all agents. We first show t…

Cited by 2 publications (2 citation statements)
References 19 publications
“…Most of their paper involves a different communication model where the agents/clients collaborate via a central server; Section 6 studies a "peer-to-peer" model which is closer to ours but requires additional assumptions on the number of malicious neighbors. A different line of work considers the case where an adversary can corrupt the observed rewards (see, e.g., [11,12,25,26,29,33,39,40,43], and the references therein), which is distinct from the role that malicious agents play in our setting.…”
Section: Other Related Work (citation type: mentioning)
confidence: 99%
“…IV). Other forms of attacks are also studied [28]-[31], including the "weak attack" model [32]-[34] where attacks are performed before observing actions. Note that the attackers in all these works have no desire to explore the environment, while the reward-teaching server has to actively learn the global model.…”
Section: Related Work (citation type: mentioning)
confidence: 99%