Artem Tsypin scite author profile

Artem Tsypin

2 Publications

31 Citation Statements Received

40 Citation Statements Given


Publications


Automating Control of Overestimation Bias for Reinforcement Learning

Kuznetsov, Grishin, Tsypin et al. 2022

Preprint


The majority of high-performing off-policy reinforcement learning algorithms use aggregated overestimation bias control techniques. However, most of them rely on pre-defined bias correction policies that are either not flexible enough or require environment-specific tuning of hyperparameters. In this work, we present a data-driven approach for automatic bias control. We demonstrate its effectiveness on three algorithms: Truncated Quantile Critics, Weighted Delayed DDPG, and Maxmin Q-learning. Our approach eliminates the need for an extensive hyperparameter search. We show that it leads to a significant reduction in the actual number of interactions while, in most cases, matching the performance of a resource-demanding grid search method. While reducing the bias improves performance on average, eliminating the aggregated bias does not always lead to the best performance. To the best of our knowledge, this is the first case where this has been shown on complex environments, which highlights important pitfalls of overestimation control.


Automating Control of Overestimation Bias for Reinforcement Learning

Kuznetsov, Grishin, Tsypin et al. 2021

Preprint


Bias correction techniques are used by most high-performing methods for off-policy reinforcement learning. However, these techniques rely on a pre-defined bias correction policy that is either not flexible enough or requires environment-specific tuning of hyperparameters. In this work, we present a simple data-driven approach for guiding bias correction. We demonstrate its effectiveness on Truncated Quantile Critics, a state-of-the-art continuous control algorithm. The proposed technique can adjust the bias correction across environments automatically. As a result, it eliminates the need for an extensive hyperparameter search, significantly reducing the actual number of interactions and computation.
