2020
DOI: 10.48550/arxiv.2009.13051
Preprint

Agent Environment Cycle Games

Abstract: Partially Observable Stochastic Games (POSGs) are the most general model of games used in Multi-Agent Reinforcement Learning (MARL), modeling actions and observations as happening simultaneously for all agents. We introduce Agent Environment Cycle Games (AEC Games), a model of games based on sequential agent actions and observations. AEC Games can be thought of as sequential versions of POSGs, and we prove that they are equally powerful. We argue conceptually and through case studies that the AEC games model is…

Citations: cited by 2 publications (2 citation statements)
References: 12 publications
“…The mathematical model that considers a sequential (instead of simultaneous) decision-making of the agents is the Agent Environment Cycle (AEC) game [39]. In [37], Terry et al prove that, for every POMG, an equivalent AEC game exists and vice versa.…”
Section: Mathematical Preliminaries (mentioning)
confidence: 99%
“…It is to differentiate between two cases for status update from $s_t$ to $s_{t+1}$. For environment steps, ($i = 0$) applies that the next state $s_{t+1}$ is random and occurs with the probability of the transition function $P$. Otherwise, for agent steps $i > 0$, a deterministic state transition according to the transition function $T_i$ takes place [39]. Afterwards, the next agent $i$ with the probability of the next-agent function $v(i \mid s_t, i, a^i_t)$ is chosen.…”
Section: Mathematical Preliminaries (mentioning)
confidence: 99%
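
The transition rule described in the citation statement above can be illustrated with a minimal Python sketch. It is not taken from the paper or from any library: the function name `aec_step`, the callable signatures for `P`, `T`, and `v`, the dictionary representation of probability distributions, and the toy counter game are all assumptions made for illustration.

```python
import random


def aec_step(state, agent, action, P, T, v, rng):
    """One transition in the style of the quoted description (hypothetical API).

    agent == 0 is the environment; agents 1..N are players.
    P(state)               -> {next_state: probability}  (assumed signature)
    T[agent](state, action) -> next_state                 (assumed signature)
    v(state, agent, action) -> {next_agent: probability}  (assumed signature)
    """
    if agent == 0:
        # Environment step: the next state is sampled from P.
        dist = P(state)
        outcomes = list(dist)
        weights = [dist[s] for s in outcomes]
        next_state = rng.choices(outcomes, weights=weights)[0]
    else:
        # Agent step: the next state is a deterministic function T_i.
        next_state = T[agent](state, action)

    # The next agent is sampled from v, conditioned on the current
    # state s_t, the acting agent i, and its action.
    agent_dist = v(state, agent, action)
    candidates = list(agent_dist)
    agent_weights = [agent_dist[j] for j in candidates]
    next_agent = rng.choices(candidates, weights=agent_weights)[0]
    return next_state, next_agent


# Toy game (illustrative only): the state is an integer counter.
def P(state):
    # The environment leaves the counter alone or increments it.
    return {state: 0.5, state + 1: 0.5}


T = {
    1: lambda s, a: s + a,  # agent 1 adds its action
    2: lambda s, a: s - a,  # agent 2 subtracts its action
}


def v(state, agent, action):
    # Fixed cycle: environment -> agent 1 -> agent 2 -> environment.
    return {(agent + 1) % 3: 1.0}


rng = random.Random(0)
state, agent = aec_step(5, 1, 2, P, T, v, rng)
```

In this sketch the next-agent function is deterministic (a fixed cycle), but because `v` returns a distribution, the same code also covers games in which the order of play is random or depends on the state.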