2017
DOI: 10.1371/journal.pone.0172395

Multiagent cooperation and competition with deep reinforcement learning

Abstract: Evolution of cooperation and competition can appear when multiple adaptive agents share a biological, social, or technological niche. In the present work we study how cooperation and competition emerge between autonomous agents that learn by reinforcement while using only their raw visual input as the state representation. In particular, we extend the Deep Q-Learning framework to multiagent environments to investigate the interaction between two learning agents in the well-known video game Pong. By manipulating…
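The "manipulating the rewarding scheme" mentioned in the abstract refers to varying the reward given when a point is scored. A minimal sketch of one such parameterized scheme, assuming the convention that the conceding agent is always penalized while the scoring agent's reward ρ is swept between -1 and 1 (the function name, player labels, and exact sign convention are illustrative, not taken from the paper's code):

```python
def pong_rewards(scorer: str, rho: float) -> dict:
    """Per-agent rewards when one player scores in two-player Pong.

    Assumed convention: the conceding agent always receives -1, while the
    scoring agent receives rho in [-1, 1]. rho = 1 recovers the fully
    competitive (zero-sum) game; rho = -1 yields a fully cooperative one,
    where losing the ball hurts both players.
    """
    conceder = "right" if scorer == "left" else "left"
    return {scorer: rho, conceder: -1.0}
```

Sweeping rho from 1 down to -1 then traces the transition from competitive to collaborative behavior that the paper studies.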

Cited by 636 publications (354 citation statements) · References 21 publications
“…Visual Multi-Agent Reinforcement Learning: Multiagent systems result in non-stationary environments, posing significant challenges. Multiple approaches have been proposed over the years to address such concerns [82,83,81,30]. Similarly, a variety of settings, from multiple cooperative agents to multiple competitive ones, have been investigated [51,65,57,11,63,35,56,29,61].…”
Section: Related Work (mentioning)
confidence: 99%
“…Each agent estimates its own optimal Q-function, $Q^*(s,a) = \max_\pi Q^\pi(s,a)$, which satisfies the Bellman optimality equation $Q^*(s,a) = \mathbb{E}\left[r + \gamma \max_{a'} Q^*(s',a') \mid s,a\right]$. Under the assumption of full observability at each agent and fully decentralized control, Tampuu et al. combined IQL with deep Q-networks (DQN) and proposed that each agent train its Q-function, parameterized by a neural network $\theta_i$, by minimizing the loss function (Tampuu et al. 2017)…”
Section: Independent Q-learning (mentioning)
confidence: 99%
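For concreteness, a minimal PyTorch sketch of the per-agent loss this excerpt refers to: each independent learner $i$ minimizes the squared TD error $\left(r + \gamma \max_{a'} Q(s',a';\theta_i^-) - Q(s,a;\theta_i)\right)^2$ over transitions from its own replay buffer, with $\theta_i^-$ a periodically copied target network. Function and variable names here are illustrative, not from the cited papers:

```python
import torch
import torch.nn.functional as F

def iql_dqn_loss(q_net, target_net, batch, gamma=0.99):
    """TD loss for one independent DQN agent (IQL).

    batch: (s, a, r, s_next, done) tensors sampled from this agent's own
    replay buffer; q_net / target_net map states to per-action Q-values.
    """
    s, a, r, s_next, done = batch
    # Q(s, a; theta_i) for the actions actually taken
    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # max_a' Q(s', a'; theta_i^-) from the frozen target network
        q_next = target_net(s_next).max(dim=1).values
        target = r + gamma * (1.0 - done.float()) * q_next
    return F.smooth_l1_loss(q_sa, target)
```

From each learner's perspective the other agent is simply part of the environment, which is precisely what makes that environment non-stationary as both agents adapt.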
“…Brafman and Tennenholtz introduce a model-based reinforcement learning algorithm, R-Max, to deal with stochastic games [5]. Such stochastic elements can notably increase the complexity of multi-agent systems and multi-agent tasks, where agents learn to cooperate and compete simultaneously [6][10]. As other agents adapt and actively adjust their policies, the best policy for each agent evolves dynamically, giving rise to non-stationarity [8][9].…”
Section: Introduction (mentioning)
confidence: 99%