2020 59th IEEE Conference on Decision and Control (CDC)
DOI: 10.1109/cdc42340.2020.9303801
Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning

Cited by 23 publications (17 citation statements)
References 23 publications
“…As an example, Omidshafiei et al (2017) propose a decentralized MARL algorithm that uses RNNs to improve the agents' observability. Mao et al (2020) use an RNN to first compress the agents' histories into embeddings that are subsequently fed into deep Q-networks, helping to improve the agents' observability. The commonly used paradigm of centralized training with decentralized execution also helps alleviate partial observability at training time (Oliehoek et al, 2011; Rashid et al, 2018; Foerster et al, 2016).…”
Section: A1 Partial Observability in MARL (mentioning)
confidence: 99%
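The embed-then-learn idea quoted above can be illustrated with a minimal numpy sketch: a recurrent update compresses an observation history into a fixed-size embedding, which a Q-function head then maps to action values. All names, dimensions, and the random weights below are hypothetical placeholders, not the cited authors' architecture.

```python
import numpy as np

class HistoryEmbedder:
    """Toy recurrent embedder: compresses an observation history into a
    fixed-size vector h, which a linear Q-head maps to action values.
    Weights are random placeholders, not a trained model."""

    def __init__(self, obs_dim, embed_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W_h = rng.normal(scale=0.1, size=(embed_dim, embed_dim))
        self.W_o = rng.normal(scale=0.1, size=(embed_dim, obs_dim))
        self.W_q = rng.normal(scale=0.1, size=(n_actions, embed_dim))
        self.embed_dim = embed_dim

    def embed(self, history):
        # Recurrent update: h_t = tanh(W_h h_{t-1} + W_o o_t)
        h = np.zeros(self.embed_dim)
        for obs in history:
            h = np.tanh(self.W_h @ h + self.W_o @ obs)
        return h

    def q_values(self, history):
        # The fixed-size embedding is fed into a (here linear) Q-head,
        # which could be swapped for any downstream RL learner.
        return self.W_q @ self.embed(history)

agent = HistoryEmbedder(obs_dim=4, embed_dim=8, n_actions=3)
history = [np.ones(4), np.zeros(4), np.ones(4)]
q = agent.q_values(history)
print(q.shape)  # (3,): one value per action
```

Because the history is reduced to a fixed-size vector before the Q-head sees it, the downstream learner never needs to handle variable-length histories directly.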
“…[78] designs a neural network architecture, IPOMDPnet, which extends the QMDP-net planning algorithm [79] to MARL settings under POMDPs. Besides, [80] introduces the concept of information state embedding to compress agents' histories and proposes an RNN model combining the state embedding. Their method, the embed-then-learn pipeline, is universal since the embedding can be fed into any existing partially observable MARL algorithm as a black box.…”
Section: Vertical Federated Reinforcement Learning (mentioning)
confidence: 99%
“…The quantization is done through approximations, as measured by the Kullback-Leibler divergence (relative entropy) between probability density functions. Further recent studies include ([75]) and ([101]).…”
Section: Literature Review (mentioning)
confidence: 99%
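The KL-divergence comparison mentioned above can be sketched for discretized densities: quantize a distribution, then measure the information lost relative to the original. The quantization scheme and function names below are illustrative assumptions, not the cited paper's construction.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete densities.
    eps guards against log(0) for zero-probability entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def quantize(p, n_levels=4):
    """Crude quantization: round each probability to the nearest of
    n_levels grid points, then renormalize back to a distribution."""
    levels = np.round(np.asarray(p, dtype=float) * n_levels) / n_levels
    return levels / levels.sum()

p = np.array([0.1, 0.2, 0.3, 0.4])
pq = quantize(p, n_levels=10)
loss = kl_divergence(p, pq)  # information lost by quantizing
print(f"KL(p || quantized p) = {loss:.6f}")
```

Finer quantization grids (larger `n_levels`) drive the divergence toward zero, which is the sense in which such approximations can be made arbitrarily tight.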
“…([101]) presents a notion of approximate information variable and studies the near-optimality of policies that satisfy the approximate information state property. In ([75]), a similar problem is analyzed under a decentralized setup. Our explicit approximation results in this chapter will find applications in both of these studies.…”
Section: Literature Review (mentioning)
confidence: 99%