2019
DOI: 10.1609/aiide.v15i1.5230

Macro Action Selection with Deep Reinforcement Learning in StarCraft

Abstract: StarCraft (SC) is one of the most popular and successful Real Time Strategy (RTS) games. In recent years, SC has also been widely accepted as a challenging testbed for AI research because of its enormous state space, partially observed information, and multi-agent collaboration, among other factors. With the help of the annual AIIDE and CIG competitions, a growing number of SC bots have been proposed and continuously improved. However, a large gap remains between the top-level bots and professional human players. One vital reason is that…

Cited by 20 publications (1 citation statement)
References: 12 publications
“…The Ape-X algorithm separates data collection from strategy learning, uses multiple parallel agents to collect experience data, shares a large experience data buffer, and sends it to learners for learning. The original Ape-X, which was based on DQN and Deep Deterministic Policy Gradient (DDPG), was utilized in the feedback flow separation control system [53], StarCraft games [54] and controlling vehicles for autonomous driving [55]. The following two characteristics are where this paper most clearly improves: we first connect the Ape-X with a distributed reinforcement learning framework for off-policy learning.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
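
The citation statement above describes the general Ape-X layout: several actors collect experience, a shared buffer stores it, and a learner trains from that buffer. The following is a minimal sketch of that separation only, not the implementation from either paper. Everything in it is an assumption made for illustration: the 5-state chain environment (`toy_step`), the tabular Q-values standing in for a neural network, uniform sampling in place of the prioritized replay that Ape-X actually uses, and a round-robin loop in place of truly parallel actors.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """Bounded FIFO buffer that every actor writes to and the learner reads from."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling (assumption): real Ape-X samples by priority.
        items = list(self.storage)
        return rng.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self.storage)


def toy_step(state, action):
    """Hypothetical chain environment: states 0..4, reward 1.0 on reaching state 4."""
    next_state = max(0, min(4, state + (1 if action == 1 else -1)))
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


def actor_collect(actor_id, q_table, epsilon, buffer, steps, rng):
    """Actor: acts epsilon-greedily and only writes experience; it never learns."""
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max((0, 1), key=lambda a: q_table[state][a])
        next_state, reward, done = toy_step(state, action)
        buffer.add((state, action, reward, next_state, done, actor_id))
        state = 0 if done else next_state


def learner_update(q_table, buffer, batch_size, rng, lr=0.5, gamma=0.9):
    """Learner: samples from the shared buffer and applies Q-learning updates."""
    for state, action, reward, next_state, done, _ in buffer.sample(batch_size, rng):
        target = reward if done else reward + gamma * max(q_table[next_state])
        q_table[state][action] += lr * (target - q_table[state][action])


def run(num_actors=4, rounds=50, seed=0):
    rng = random.Random(seed)
    buffer = SharedReplayBuffer(capacity=1000)
    q_table = [[0.0, 0.0] for _ in range(5)]
    # Each actor gets its own exploration rate, from 0.1 up to 0.9.
    epsilons = [0.1 + 0.8 * i / max(1, num_actors - 1) for i in range(num_actors)]
    for _ in range(rounds):
        # Round-robin stands in for parallel actors (assumption).
        for actor_id, epsilon in enumerate(epsilons):
            actor_collect(actor_id, q_table, epsilon, buffer, steps=10, rng=rng)
        learner_update(q_table, buffer, batch_size=32, rng=rng)
    return q_table, buffer
```

Calling `run()` returns the learned Q-table and the filled buffer. The point of the sketch is the division of labor: `actor_collect` never updates Q-values and `learner_update` never touches the environment, so the two only meet at the shared buffer.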