2015
DOI: 10.1109/tsp.2015.2403288

Distributed Multi-Agent Online Learning Based on Global Feedback

Abstract: In many types of multi-agent systems, distributed agents cooperate with each other to take actions with the goal of maximizing an overall system reward. However, in many of these systems, agents only receive a (perhaps noisy) global feedback about the realized overall reward rather than individualized feedback about the relative merit of their own actions with respect to the overall reward. If the contribution of an agent's actions to the overall reward is unknown a priori, it is crucial for the agents to util…

citations
Cited by 26 publications
(13 citation statements)
references
References 23 publications
“…For example, motivated by the applications in cognitive radio network, a line of research (e.g., [28,38,7]) studied the regret minimization problem where the radio channels are modeled by the arms and the rewards represent the utilization rates of radio channels which could be deeply discounted if an arm is simultaneously played by multiple agents and a collision occurs. Regret minimization algorithms were also designed for the distributed settings with an underlying communication network for the peer-to-peer environments (e.g., [41,26,43]). In [6,12], the authors studied distributed regret minimization in the adversarial case.…”
mentioning
confidence: 99%
“…In such scenarios, significant improvement is expected by enabling cooperative learning among the distributed learners [39]. The challenges in these scenarios are how to design efficient cooperative learning algorithms with low communication complexity [40] and, when the distributed learners are self-interested and have conflicting goals, how to incentivize them to participate in the cooperative learning process using, e.g. rating mechanisms [41] [42].…”
Section: Discussion
mentioning
confidence: 99%
“…Xu and C. Tekin [18], we have projected scrupulously solemnize this issue and progress online learning algorithm that permit the agents to accommodatingly learn how to exploit the overall reward in the worldwide feedback situations without swapping any data among themselves we demonstrate that when the agents perceive the worldwide feedback without faults. The dispersed nature of the measured multi-agent scheme results in to functions loss associates with the case where agents can altercation data.…”
Section: Literature Survey
mentioning
confidence: 98%