2020
DOI: 10.48550/arxiv.2008.06220
Preprint

Kernel Methods for Cooperative Multi-Agent Contextual Bandits

Abhimanyu Dubey,
Alex Pentland

Abstract: Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose COOP-KERNELUCB, an al…
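
As a point of reference for the setting described in the abstract, below is a minimal single-agent KernelUCB selection step, in the spirit of the kernelised UCB family the paper builds on. This is a sketch under stated assumptions, not the paper's COOP-KERNELUCB: the cooperative estimator and the delayed network communication are omitted, and the RBF kernel, regulariser lam, and exploration weight beta are illustrative choices rather than the paper's settings.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix between the rows of X and the rows of Y.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

def kernel_ucb_scores(X_hist, y_hist, X_arms, lam=1.0, beta=1.0, gamma=1.0):
    # Upper-confidence score for each candidate arm context:
    #   mean(x)  = k(x)^T (K + lam I)^{-1} y
    #   width(x) = sqrt(k(x, x) - k(x)^T (K + lam I)^{-1} k(x)) / sqrt(lam)
    K = rbf_kernel(X_hist, X_hist, gamma)
    A = K + lam * np.eye(len(X_hist))
    k_arm = rbf_kernel(X_hist, X_arms, gamma)        # shape (t, n_arms)
    mean = k_arm.T @ np.linalg.solve(A, y_hist)
    sol = np.linalg.solve(A, k_arm)
    var = np.diag(rbf_kernel(X_arms, X_arms, gamma)) - np.sum(k_arm * sol, axis=0)
    return mean + beta * np.sqrt(np.maximum(var, 0.0) / lam)

# Usage: the agent plays the arm whose context maximises the score, e.g.
# chosen = np.argmax(kernel_ucb_scores(X_hist, y_hist, X_arms))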

Cited by 3 publications (4 citation statements)
References 23 publications
“…In addition to the theoretical bounds on regret, experiments on both synthetic data and real data also verify the feasibility of the proposed gossiping approach of federated bandit. Future work may include extending this framework to contextual bandits [46] with local features or bandits with continuous arms [53].…”
Section: Discussion
confidence: 99%
“…A naive agent, which uses a standard centralized bandit algorithm, may not solve the problem without exchanging information with other agents. The heterogeneous reward structure is ready for extension to a contextual case [46] by considering the feature or local feature v [47,48] of each sample, where the regret could be R…”
Section: Problem Formulation
confidence: 99%
“…The idea of using kernel mean embeddings (KME) for adaptive domain generalization was proposed in the work of Blanchard et al [7]. Kernel mean embeddings have also been used for personalized learning in both multi-task [11] and multi-agent learning [13] bandit problems. A rigorous treatment of domain-adaptive generalization in the context of KME approaches is provided in Deshmukh et al [12].…”
Section: Related Work
confidence: 99%
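
For readers unfamiliar with the term in the quoted passage: the kernel mean embedding of a distribution P is mu_P = E_{x~P}[k(x, .)], estimated from a sample by averaging feature maps, and two domains can then be compared via the maximum mean discrepancy ||mu_P - mu_Q||. The sketch below computes the simple biased MMD estimator between two samples; the RBF kernel and bandwidth gamma are illustrative assumptions, not choices taken from the cited works.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix between the rows of X and the rows of Y.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd_squared(X, Y, gamma=1.0):
    # Squared distance between the empirical kernel mean embeddings of two
    # samples: ||mu_X - mu_Y||^2 = mean k(X,X) - 2 mean k(X,Y) + mean k(Y,Y).
    return (rbf_kernel(X, X, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())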
“…The contextual bandit problem, however, is a very interesting candidate for private methods, since the involved contexts and rewards both typically contain sensitive user information [38]. There is an increasing body of work on online learning and multi-armed bandits in cooperative settings [13,31,39], and private single-agent learning [41,38], but methods for private federated bandit learning are still elusive, despite their immediate applicability.…”
Section: Introduction
confidence: 99%