Bolin Gao scite author profile

In this paper, we propose a passivity-based methodology for the analysis and design of reinforcement learning in multi-agent finite games. Starting from a known exponentially-discounted reinforcement learning scheme, we show that convergence to a Nash distribution holds in the class of games characterized by the monotonicity property of their (negative) payoff. We further exploit passivity to propose a class of higher-order schemes that preserve convergence properties, can improve the speed of convergence, and can even converge in cases where their first-order counterparts fail to converge. We demonstrate these properties through numerical simulations for several representative games.

arXiv:1808.04464v1 [math.OC]

Continuous-Time Discounted Mirror Descent Dynamics in Monotone Concave Games

Gao

Pavel

2021

IEEE Trans. Automat. Contr.

Optimal Design of On-Center Steering Force Characteristic Based on Correlations between Subjective and Objective Evaluations

Dang

Chen

Gao

et al.

2014

SAE Int. J. Passeng. Cars - Mech. Syst.

Taking a Look at Small-Scale Pedestrians and Occluded Pedestrians

Cao

Pang

Han

et al.

2020

IEEE Trans. on Image Process.

Small-scale pedestrian detection and occluded pedestrian detection are two challenging tasks. However, most state-of-the-art methods handle only one of these tasks at a time, which leads to relatively poor performance when both must be solved simultaneously, as is common in practice. In this paper, we find that small-scale pedestrian detection and occluded pedestrian detection share a common problem: inaccurate localization. Solving this problem can therefore improve the performance of both tasks. To this end, we pay more attention to predicted bounding boxes with worse location precision and extract more contextual information around objects, proposing two modules for these purposes: location bootstrap and semantic transition, respectively. The location bootstrap re-weights the regression loss: the loss of a predicted bounding box far from its corresponding ground truth is up-weighted, and the loss of a predicted bounding box near its corresponding ground truth is down-weighted. Meanwhile, the semantic transition adds more contextual information and relieves the semantic inconsistency of skip-layer fusion. Since the location bootstrap is not used at the test stage and the semantic transition is lightweight, the proposed method does not add much extra computational cost during inference. Experiments on the challenging Citypersons and Caltech datasets show that the proposed method outperforms the state-of-the-art methods on small-scale pedestrians and occluded pedestrians (e.g., 5.20% and 4.73% improvements on Caltech).

Index Terms: Small-scale pedestrians, occluded pedestrians, location bootstrap, semantic transition.

On Passivity and Reinforcement Learning in Finite Games

Gao

Pavel

2018

On Passivity, Reinforcement Learning, and Higher Order Learning in Multiagent Finite Games
