Beining Han scite author profile

Beining Han

5 Publications

31 Citation Statements Received

100 Citation Statements Given

Affiliations

Central China Normal University

Publications

Off-Policy Multi-Agent Decomposed Policy Gradients

Wang¹, Han², Wang³ et al. 2020

Preprint

Recently, multi-agent policy gradient (MAPG) methods have witnessed vigorous progress. However, there is a discrepancy between the performance of MAPG methods and state-of-the-art multi-agent value-based approaches. In this paper, we investigate the causes that hinder the performance of MAPG algorithms and present a multi-agent decomposed policy gradient method (DOP). This method introduces the idea of value function decomposition into the multi-agent actor-critic framework. Based on this idea, DOP supports efficient off-policy learning and addresses the issues of centralized-decentralized mismatch and credit assignment in both discrete and continuous action spaces. We formally show that DOP critics have sufficient representational capability to guarantee convergence. In addition, empirical evaluations on the StarCraft II micromanagement benchmark and multi-agent particle environments demonstrate that our method significantly outperforms state-of-the-art value-based and policy-based multi-agent reinforcement learning algorithms. Demonstrative videos are available at https://sites.google.com/view/dop-mapg/.

The relationship between health belief and sleep quality of Chinese college students: The mediating role of physical activity and moderating effect of mobile phone addiction

Gao, Han et al. 2023

Front. Public Health

Background: Poor sleep quality has become a common health problem encountered by college students.

Methods: Health belief scale (HBS), physical activity rating scale (PARS-3), mobile phone addiction tendency scale (MPATS) and Pittsburgh sleep quality index (PSQI) were adopted to analyze the data collected from survey questionnaires, which were filled out by 1,019 college students (including 429 males and 590 females) from five comprehensive colleges and universities from March 2022 to April 2022. The data collected from survey questionnaires were analyzed using SPSS and its macro-program PROCESS.

Results: (1) Health belief, physical activity, mobile phone addiction and sleep quality are significantly associated with each other (p < 0.01); (2) physical activity plays a mediating role between health belief and sleep quality, and the mediating effects account for 14.77%; (3) mobile phone addiction can significantly moderate the effect size of health belief (β = 0.062, p < 0.05) and physical activity (β = 0.073, p < 0.05) on sleep quality, and significantly moderate the effect size of health belief on physical activity (β = −0.112, p < 0.001).

Conclusion: The health belief of college students can significantly improve their sleep quality; college students’ health belief can not only improve their sleep quality directly, but also improve their sleep quality through physical activity; mobile phone addiction can significantly moderate the effect size of health belief on sleep quality, the effect size of health belief on physical activity, and the effect size of physical activity on sleep quality.

Off-Policy Reinforcement Learning with Delayed Rewards

Han¹, Ren², Wu³ et al. 2021

Preprint

We study deep reinforcement learning (RL) algorithms with delayed rewards. In many real-world tasks, instant rewards are often not readily accessible or even defined immediately after the agent performs actions. In this work, we first formally define the environment with delayed rewards and discuss the challenges raised due to the non-Markovian nature of such environments. Then, we introduce a general off-policy RL framework with a new Q-function formulation that can handle the delayed rewards with theoretical convergence guarantees. For practical tasks with high dimensional state spaces, we further introduce the HC-decomposition rule of the Q-function in our framework which naturally leads to an approximation scheme that helps boost the training efficiency and stability. We finally conduct extensive experiments to demonstrate the superior performance of our algorithms over the existing work and their variants.

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Han¹, Zheng², Chan³ et al. 2021

Preprint

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

Wang¹, Ren², Han³ et al. 2020

Preprint
