Proceedings of the 2001 IEEE International Symposium on Intelligent Control (ISIC '01) (Cat. No.01CH37206)
DOI: 10.1109/isic.2001.971476

Multi-agent Markov decision processes with limited agent communication

Abstract: A number of well-known methods exist for solving Markov Decision Problems (MDPs) involving a single decision-maker, with or without model uncertainty. Recently, there has been great interest in the multi-agent version of the problem, where there are multiple interacting decision makers. However, most of the suggested methods for multi-agent MDPs require complete knowledge of the state and action of all agents. This, in turn, results in a large communication overhead when the agents are p…
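For context on the single-agent baseline the abstract mentions, here is a minimal value-iteration sketch in Python. The two-state, two-action model (transition tensor P and reward table R) is invented for illustration and is not taken from the paper.

```python
import numpy as np

# Hypothetical MDP: 2 states, 2 actions (not from the paper).
n_states, n_actions, gamma = 2, 2, 0.9

# P[s, a, s'] = transition probability, R[s, a] = expected reward.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
    Q = R + gamma * (P @ V)        # shape (n_states, n_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)          # greedy policy w.r.t. the converged values
print(V, policy)
```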

Cited by 10 publications (8 citation statements)
References 5 publications
“…We cast the joint optimization problem in (5) as a Multi-Agent Markov Decision Process (MMDP) [25] described as the tuple {S, A, P, R}, where S = S_1 × ⋯ × S_N is a set of all possible states for all inXSs referred to as state space, A = A_1 × ⋯ × A_N is the joint action space containing all possible actions (i.e., the set of all possible combinations of channels and power levels), R denotes the reward signal and P : S × A × S → ∆ is the transition function [25], where ∆ denotes the set of probability distributions over S.…”
Section: Resource Selection With Limited Information
confidence: 99%
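The quoted definition builds the MMDP's joint state and action spaces as Cartesian products of the per-agent spaces. A minimal sketch of that construction, assuming small finite per-agent sets; the concrete channel and power values are hypothetical placeholders, not taken from the citing paper.

```python
from itertools import product

# Per-agent state and action sets for N = 3 agents.
# The values below are placeholders, not from the cited paper.
agent_states  = [("good", "bad")] * 3                                   # S_1, ..., S_N
agent_actions = [[(ch, pw) for ch in (1, 2) for pw in (0.1, 1.0)]] * 3  # A_i = channels x power levels

# Joint spaces as Cartesian products: S = S_1 x ... x S_N, A = A_1 x ... x A_N.
S = list(product(*agent_states))
A = list(product(*agent_actions))

print(len(S), len(A))   # 2^3 = 8 joint states, 4^3 = 64 joint actions
```

Note how the joint action space grows exponentially in the number of agents, which is why methods requiring full knowledge of every agent's state and action incur the communication overhead the abstract describes.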
“…Multi-agent MDP is a popular method for solving sequential optimization, decision making, and learning problems in an uncertain environment where the outcome depends on the previous actions (Mukhopadhyay & Jain, 2001). The presence of uncertainty regarding agent states and actions can lead to performance issues.…”
Section: Related Work
confidence: 99%
“…We cast the joint optimization problem in (5) as Multi-Agent Markov Decision Process (MMDP) [14] described as the tuple {S, A, P, R}, where S = S_1 × ⋯ × S_N is a set of all possible states for all inXSs referred to as state space,…”
Section: Resource Selection With 1-bit Information
confidence: 99%
“…the joint action space containing all possible actions (i.e., the set of all possible combinations of channels and power levels for problem I and all possible combinations of aggregated channels for problem II), R denotes the reward signal and P : S × A × S → ∆ is the transition function [14], where ∆ denotes the set of probability distributions over S.…”
Section: Resource Selection With 1-bit Information
confidence: 99%
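A common way to realize the quoted signature P : S × A × S → ∆, where ∆ is the set of probability distributions over S, is a table mapping each (state, action) pair to a distribution over next states. The sketch below uses hypothetical state and action names; none of them come from the cited works.

```python
import random

# Transition function as a table: for each (state, action) pair,
# a probability distribution over next states. All names here are
# hypothetical placeholders, not taken from the cited papers.
P = {
    ("s0", "a0"): {"s0": 0.8, "s1": 0.2},
    ("s0", "a1"): {"s0": 0.1, "s1": 0.9},
    ("s1", "a0"): {"s0": 0.5, "s1": 0.5},
    ("s1", "a1"): {"s0": 0.3, "s1": 0.7},
}

def step(state, action):
    """Sample the next state from the distribution P(. | state, action)."""
    dist = P[(state, action)]
    next_states, probs = zip(*dist.items())
    return random.choices(next_states, weights=probs, k=1)[0]

print(step("s0", "a1"))   # most likely prints "s1"
```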