Chenjia Bai scite author profile

Chenjia Bai

5 Publications

80 Citation Statements Received

86 Citation Statements Given

Affiliations

Harbin Institute of Technology, Beijing Academy of Artificial Intelligence

Publications

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

Bai, Wang, Yang, et al. 2022

Preprint

Offline Reinforcement Learning (RL) aims to learn policies from previously collected datasets without exploring the environment. Directly applying off-policy algorithms to offline RL usually fails due to the extrapolation error caused by out-of-distribution (OOD) actions. Previous methods tackle this problem by penalizing the Q-values of OOD actions or by constraining the trained policy to stay close to the behavior policy. Nevertheless, such methods typically prevent the value functions from generalizing beyond the offline data and also lack a precise characterization of OOD data. In this paper, we propose Pessimistic Bootstrapping for offline RL (PBRL), a purely uncertainty-driven offline algorithm without explicit policy constraints. Specifically, PBRL quantifies uncertainty via the disagreement of bootstrapped Q-functions and performs pessimistic updates by penalizing the value function based on the estimated uncertainty. To tackle the extrapolation error, we further propose a novel OOD sampling method. We show that such OOD sampling and pessimistic bootstrapping yield a provable uncertainty quantifier in linear MDPs, thus providing the theoretical underpinning for PBRL. Extensive experiments on the D4RL benchmark show that PBRL performs better than state-of-the-art algorithms.
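The uncertainty penalty described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names `pessimistic_targets` and `ood_targets`, the penalty weights `beta` and `beta_ood`, and the use of the ensemble standard deviation as the disagreement measure are all assumptions made here for illustration. The sketch takes K bootstrapped Q-estimates per sample, measures their disagreement, and subtracts a scaled version of it from the bootstrapped target.

```python
import numpy as np


def pessimistic_targets(q_next, rewards, gamma=0.99, beta=1.0):
    """Illustrative penalized TD targets (hypothetical helper, not from the paper).

    q_next:  shape (K, B), K bootstrapped Q-estimates for B next state-action pairs.
    rewards: shape (B,), rewards observed in the offline dataset.
    Returns an array of shape (K, B), one target per ensemble member.
    """
    q_next = np.asarray(q_next, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    # Assumption: disagreement is measured as the standard deviation over the ensemble axis.
    uncertainty = q_next.std(axis=0)
    # Subtract the scaled uncertainty from each member's next-state value estimate.
    return rewards + gamma * (q_next - beta * uncertainty)


def ood_targets(q_ood, beta_ood=1.0):
    """Illustrative targets for sampled OOD actions: estimate minus scaled disagreement."""
    q_ood = np.asarray(q_ood, dtype=float)
    return q_ood - beta_ood * q_ood.std(axis=0)


if __name__ == "__main__":
    # Two ensemble members, two samples: they disagree on the first, agree on the second.
    q = [[1.0, 1.0], [3.0, 1.0]]
    print(pessimistic_targets(q, rewards=[0.0, 0.0], gamma=1.0, beta=1.0))
```

In this toy input the first sample is penalized because the two estimates disagree, while the second sample, where they agree, keeps its value unchanged.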

Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning

Bai, Xiao, Zhu, et al. 2024

IEEE Trans. Neural Netw. Learning Syst.

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

Hao, Yang, Tang, et al. 2024

IEEE Trans. Neural Netw. Learning Syst.

Guided goal generation for hindsight multi-goal reinforcement learning

et al. 2019

Exploration in Deep Reinforcement Learning: A Comprehensive Survey

Yang, Tang, Bai, et al. 2021

Preprint

Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant success across a wide range of domains, including game AI, autonomous vehicles, robotics, finance, healthcare, and transportation. However, DRL and deep MARL agents are widely known to be sample-inefficient: millions of interactions are usually needed even for relatively simple game settings, which prevents wide application and deployment in real-world industrial scenarios. One underlying bottleneck is the well-known exploration problem, i.e., how to efficiently explore unknown environments and collect the informative experiences that most benefit learning an optimal policy.
