In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims to incorporate semi-supervised learning into reinforcement learning by splitting Q-values into state values and action advantages. We require that an offline expert assess the value of a state in a coarse manner using three discrete values. An expert network is designed in addition to the Q-network; it is updated after each regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is indeed useful and more resistant to the overestimation bias. The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is indeed suitable for integrating state values from expert examples into Q-learning.
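To make the described architecture concrete, the following is a minimal PyTorch-style sketch of a dueling Q-network alongside a separate expert network trained on coarse three-valued state labels. The class names, layer sizes, the use of a three-class cross-entropy loss for the expert network, and the `expert_buffer.sample` helper are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuelingQNet(nn.Module):
    """Dueling decomposition: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s, .)

    def forward(self, s):
        h = self.trunk(s)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)

class ExpertNet(nn.Module):
    """Coarse state assessor: scores a state as one of three discrete
    classes (e.g. losing / neutral / winning)."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))

    def forward(self, s):
        return self.net(s)  # logits over the three coarse state values

def expert_update(expert_net, optimizer, expert_buffer, batch_size=32):
    """Run after every regular offline minibatch update of the Q-network,
    but only while the expert example buffer is non-empty."""
    if len(expert_buffer) == 0:
        return
    states, labels = expert_buffer.sample(batch_size)  # hypothetical helper; labels in {0, 1, 2}
    loss = F.cross_entropy(expert_net(states), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In this reading, the expert examples supervise the coarse state assessment directly, while the Q-network continues to be trained with its regular off-policy targets; how the expert's state value is then combined with the advantage stream is left out of this sketch.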
This paper deals with the problem of solving stochastic games (which have numerous business and economic applications) using the tools of Learning Automata (LA), the precursors to Reinforcement Learning (RL). Classical LA systems that possess properties of absorbing barriers have been used as powerful tools in game theory to devise solutions that converge to the game's Nash equilibrium under limited information (Sastry, Phansalkar, and Thathachar 1994). Games with limited information are intrinsically hard because a player knows neither the actions chosen by the other players nor their outcomes; the player might not even be aware of the fact that he/she is playing against an opponent. In the current state of the art, the numerous LA works applicable to solving game-theoretical problems can merely handle the case where the game possesses a Saddle Point in pure strategies; they are unable to reach mixed Nash equilibria when no Saddle Point exists in pure strategies. Additionally, within the field of LA, and RL in general, theoretical and applied LA schemes with artificial barriers are scarce, even though incorporating artificial barriers into LA has been a powerful yet under-explored concept since its inception in the 1980s. More recently, the idea of introducing artificial non-absorbing barriers was pioneered, which renders LA schemes resilient to being trapped in absorbing barriers. In this paper, we devise an LA with artificial barriers for solving a general form of stochastic bimatrix games. The problem's complexity is augmented by the fact that we consider games in which there is no Saddle Point. By resorting to the above-mentioned concept of artificial reflecting barriers, we propose an LA that converges to an optimal mixed Nash equilibrium even though no Saddle Point may exist in pure strategies.
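To illustrate the core idea of a reflecting (non-absorbing) barrier, the toy NumPy sketch below runs two linear reward-inaction automata against each other on a 2x2 stochastic game whose payoff probabilities admit no pure-strategy Saddle Point. The payoff values, the zero-sum-style feedback model, and the simple clipping step are assumptions made purely for illustration; the paper's actual update scheme and its convergence analysis to the mixed Nash equilibrium are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Row player's reward probabilities in a 2x2 stochastic game chosen so that
# no pure-strategy Saddle Point exists (illustrative values only).
R = np.array([[0.3, 0.7],
              [0.6, 0.4]])

def lri_step(p, action, rewarded, lam=0.01, p_min=0.02):
    """One linear reward-inaction (L_RI) update with an artificial reflecting
    barrier: action probabilities are kept inside [p_min, 1 - p_min], so the
    automaton can never be absorbed into a pure strategy."""
    if rewarded:
        p = p + lam * (np.eye(len(p))[action] - p)
    p = np.clip(p, p_min, 1.0 - p_min)
    return p / p.sum()  # renormalise after clipping

p_row = np.array([0.5, 0.5])  # row automaton's mixed strategy
p_col = np.array([0.5, 0.5])  # column automaton's mixed strategy

for t in range(100_000):
    a = rng.choice(2, p=p_row)
    b = rng.choice(2, p=p_col)
    r_row = rng.random() < R[a, b]        # stochastic, limited-information feedback
    r_col = rng.random() < 1.0 - R[a, b]  # zero-sum-style feedback for illustration
    p_row = lri_step(p_row, a, r_row)
    p_col = lri_step(p_col, b, r_col)

print(p_row, p_col)  # both strategies remain interior, i.e. mixed
```

The point of the sketch is the barrier itself: without the clipping step the reward-inaction dynamics can drift toward, and be absorbed at, a pure strategy, whereas the reflecting barrier keeps the automata inside the simplex where a mixed equilibrium can be represented.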