WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

Lan, Tian; Srinivasa, Sunil; Wang, Huan; Zheng, Stephan

doi:10.48550/arxiv.2108.13976

Cited by 2 publications

(2 citation statements)

References 8 publications


“…A few recent works, e.g., Brax [7], Isaac Gym [19], and WarpDrive [16], use accelerators like GPUs and TPUs for the environment engine. Due to the highly parallel nature of the accelerators, numerous environments can be executed simultaneously.…”

Section: Related Work (mentioning)

confidence: 99%

EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine

Weng, Lin, Huang et al. 2022

Preprint


There has been significant progress in developing reinforcement learning (RL) training systems. Past works such as IMPALA, Apex, Seed RL, Sample Factory, and others aim to improve the system's overall throughput. In this paper, we try to address a common bottleneck in the RL training system, i.e., parallel environment execution, which is often the slowest part of the whole system but receives little attention. With a curated design for parallelizing RL environments, we have improved the RL environment simulation speed across different hardware setups, ranging from a laptop and a modest workstation to a high-end machine such as the NVIDIA DGX-A100. On a high-end machine, EnvPool achieves 1 million frames per second for the environment execution on Atari environments and 3 million frames per second on MuJoCo environments. When running on a laptop, the speed of EnvPool is 2.8 times that of the Python subprocess. Moreover, great compatibility with existing RL training libraries has been demonstrated in the open-source community, including CleanRL, rl_games, DeepMind Acme, etc. Finally, EnvPool allows researchers to iterate on their ideas at a much faster pace and has great potential to become the de facto RL environment execution engine. Example runs show that it takes only 5 minutes to train Atari Pong and MuJoCo Ant, both on a laptop. EnvPool has already been open-sourced at https://github.com/sail-sg/envpool.

“…MARL frameworks such as [10] and MAVA [15] are designed to enable easier and more efficient implementation of MARL algorithms. The former innovates by focusing on performance with the use of GPU and their parallelization power.…”

Section: Related Work (mentioning)

confidence: 99%

Phantom -- A RL-driven multi-agent framework to model complex systems

Ardon, Vann, Garg et al. 2022

Preprint


Agent-based modeling (ABM) is a computational approach to modeling complex systems by specifying the behavior of autonomous decision-making components or agents in the system and allowing the system dynamics to emerge from their interactions. Recent advances in the field of multi-agent reinforcement learning (MARL) have made it feasible to learn the equilibrium of complex environments where multiple agents learn at the same time, opening up the possibility of building ABMs where agent behaviors are learned and system dynamics can be analyzed. However, most ABM frameworks are not RL-native, in that they do not offer concepts and interfaces that are compatible with the use of MARL to learn agent behaviors. In this paper, we introduce a new framework, Phantom, to bridge the gap between ABM and MARL. Phantom is an RL-driven framework for agent-based modeling of complex multi-agent systems such as economic systems and markets. To enable this, the framework provides tools to specify the ABM in MARL-compatible terms, including features to encode dynamic partial observability, agent utility / reward functions, heterogeneity in agent preferences or types, and constraints on the order in which agents can act (e.g. Stackelberg games, or complex turn-taking environments). In this paper, we present these features, their design rationale and show how they were used to model and simulate Over-The-Counter (OTC) markets.

CCS CONCEPTS: Software and its engineering → Application specific development environments; Theory of computation → Multiagent reinforcement learning; Market equilibria.
