StarCraft II poses a grand challenge for reinforcement learning. The main difficulties include huge state space, varying action space, long horizon, etc. In this paper, we investigate a set of techniques of reinforcement learning for the full-length game of StarCraft II. We investigate a hierarchical approach, where the hierarchy involves two levels of abstraction. One is the macro-actions extracted from expert's demonstration trajectories, which can reduce the action space in an order of magnitude yet remains effective. The other is a two-layer hierarchical architecture, which is modular and easy to scale. We also investigate a curriculum transfer learning approach that trains the agent from the simplest opponent to harder ones. On a 64×64 map and using restrictive units, we train the agent on a single machine with 4 GPUs and 48 CPU threads. We achieve a winning rate of more than 99% against the difficulty level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat model, we can achieve over 93% winning rate against the most difficult non-cheating built-in AI (level-7) within days. We hope this study could shed some light on the future research of large-scale reinforcement learning.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.