2022
DOI: 10.48550/arxiv.2206.10558
Preprint

EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine

Abstract: There has been significant progress in developing reinforcement learning (RL) training systems. Past works such as IMPALA, Apex, Seed RL, Sample Factory, and others aim to improve the system's overall throughput. In this paper, we try to address a common bottleneck in the RL training system, i.e., parallel environment execution, which is often the slowest part of the whole system but receives little attention. With a curated design for paralleling RL environments, we have improved the RL environment simulation…
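For context, a minimal usage sketch of EnvPool's batched, gym-style Python interface follows. It is based on the project's documented API (`envpool.make` with `env_type` and `num_envs`), but exact return conventions vary by version (older gym 4-tuple vs. newer gymnasium 5-tuple), so treat the shapes and unpacking below as illustrative rather than definitive:

```python
import numpy as np
import envpool

# Create 64 Atari Pong environments executed by EnvPool's C++ backend;
# the returned object behaves like a single gym environment whose
# observations, rewards, and done flags are batched along axis 0.
envs = envpool.make("Pong-v5", env_type="gym", num_envs=64)

obs = envs.reset()                          # e.g. shape (64, 4, 84, 84) for Atari
actions = np.zeros(64, dtype=np.int32)      # one action per sub-environment
obs, rew, done, info = envs.step(actions)   # all 64 environments stepped in one call
```

The point of the batched call is that a single `step` dispatches all sub-environments to the engine's internal thread pool, so the Python side pays one function-call and synchronization cost per batch rather than per environment.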

Cited by 2 publications (3 citation statements)
References 18 publications
“…Existing experience replay systems such as RLlib (Liang et al, 2018), stable-baseline (Hill et al, 2018), rlpyt (Stooke & Abbeel, 2019), tianshou (Weng et al, 2022a), sample factory (Petrenko et al, 2020), and envpool (Weng et al, 2022b) have been predominantly designed for relatively smaller RL models and their implementations are confined to single-server contexts. Consequently, they are not equipped to handle the distributed trajectory storage, selection, and collection necessary for training large RL models.…”
Section: Limitations Of Existing Systems
confidence: 99%
“…Existing experience replay systems, unfortunately, fall short in fully addressing the aforementioned challenges. Most of these systems, such as RLlib (Liang et al, 2018), RL-Zoo (Ding et al, 2021), stable-baselines (Hill et al, 2018), rlpyt (Stooke & Abbeel, 2019), tianshou (Weng et al, 2022a), TorchOpt-RL (Liu et al, 2022; Ren et al, 2022), sample factory (Petrenko et al, 2020), and envpool (Weng et al, 2022b), are incorporated as part of single-server RL frameworks and fail to offer distributed trajectory storage, selection, and collection. The recent development in distributed experience replay systems, exemplified by Reverb (Cassirer et al, 2021), allows for storing trajectories on memory-optimized servers.…”
Section: Introduction
confidence: 99%
“…It also suffers from reduced usability (e.g., difficult to add diverse assets) and functionality (e.g., object contacts are inaccessible). EnvPool (Weng et al, 2022) batches environments by a thread pool to minimize synchronization and improve CPU utilization. Yet its environments need to be implemented in C++, which hinders fast prototyping (e.g., customizing observations and rewards).…”
Section: Introduction
confidence: 99%
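To make the thread-pool batching idea in the quotation above concrete, here is a hypothetical pure-Python sketch. `DummyEnv`, `BatchedEnvPool`, and every name in it are illustrative assumptions, not EnvPool's actual C++ implementation: worker threads step all sub-environments for the current batch of actions, and the caller receives stacked arrays with one synchronization point per batch.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

class DummyEnv:
    """Stand-in environment; a real setup would wrap Atari, MuJoCo, etc."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)

    def step(self, action):
        obs = self.rng.standard_normal(4)    # fake observation
        return obs, float(action), False     # (obs, reward, done)

class BatchedEnvPool:
    """Step a batch of environments on a shared thread pool and return
    stacked (obs, reward, done) arrays; synchronization happens once per
    batch rather than once per environment."""
    def __init__(self, num_envs, num_threads=8):
        self.envs = [DummyEnv(i) for i in range(num_envs)]
        self.pool = ThreadPoolExecutor(max_workers=num_threads)

    def step(self, actions):
        results = list(self.pool.map(
            lambda ea: ea[0].step(ea[1]), zip(self.envs, actions)))
        obs, rew, done = map(np.asarray, zip(*results))
        return obs, rew, done

pool = BatchedEnvPool(num_envs=16)
obs, rew, done = pool.step(np.ones(16))      # obs.shape == (16, 4)
```

Because Python's GIL prevents pure-Python environment steps from running truly in parallel, EnvPool implements both the environments and the thread pool in C++, which is precisely the usability trade-off (fast execution vs. C++-only environment authoring) that the quotation points out.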