Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples

Meng, Li; Yazidi, Anis; Goodwin, Morten; Engelstad, Paal E.

doi:10.7557/18.6237

Cited by 4 publications

(4 citation statements)

References 16 publications


“…It is noteworthy that the aggregation of the key does not invariably require a linear output layer. A Minkowski sum of embeddings has been demonstrated as an alternative approach that can competently generate effective aggregated representations (Meng et al, 2023). Building on this insight, we have crafted an additional aggregation strategy that employs a straightforward averaging operation of K with H charts, shown in Eq.…”

Section: Methods (mentioning)

confidence: 99%


Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise

et al. 2023


Q-learning is one of the best-known reinforcement learning algorithms, and there have been tremendous efforts to extend it with neural networks. Bootstrapped Deep Q-Learning Network is among them. It uses multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take at a given state, analogous to the definition of the exploration ratio in RL. Thus, the performance of Bootstrapped Deep Q-Learning Network is deeply connected with the level of diversity within the algorithm. The original research pointed out that a random prior could improve the performance of the model. In this article, we further explore the possibility of replacing priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into the algorithm. We conduct our experiments on the Atari benchmark and compare our algorithm to both the original and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games. We therefore conclude that replacing priors with noise can improve Bootstrapped Deep Q-Learning's performance by ensuring the integrity of its diversity.



“…modules. Our approach is inspired by the unbalanced atlas (UA) paradigm (Korman, 2021;Meng et al, 2023) from the area of self-supervised learning (SSL). In our methodology, the key's weights have been uncoupled from the weights of the query and value, which are instead characterized as the charts of a manifold.…”

Section: Introduction (mentioning)

confidence: 99%

Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise

et al. 2023


“…Another application utilizes the transformer model in RL as the replacement of convolutional layers for feature extraction. This is the case of the Swin Transformer model used for image processing [12]. It differs from the present paper, which does not incorporate the entire game screen as part of its input.…”

Section: Introduction (mentioning)

confidence: 92%

Playing Flappy Bird Based on Motion Recognition Using a Transformer Model and LIDAR Sensor

Dirgová Luptáková, Kubovčík, Pospíchal

2024

Sensors


In the present study, a transformer neural network is employed to predict Q-values in a simulated environment using reinforcement learning techniques. The goal is to teach an agent to navigate and excel in the Flappy Bird game, which has become a popular model for control in machine learning approaches. Unlike most leading existing approaches, which use the game’s rendered image as input, our main contribution lies in using sensory input from LIDAR, represented by the ray casting method. Specifically, we focus on understanding the temporal context of measurements from a ray casting perspective and on optimizing potentially risky behavior by considering how closely the agent approaches objects identified as obstacles. The agent learned to use the measurements from ray casting to avoid collisions with obstacles. Our model substantially outperforms related approaches. Going forward, we aim to apply this approach in real-world scenarios.


“…Regarding interactive modeling concepts, Deep Reinforcement Learning (DRL) [18] offers a novel approach to knowledge inference. The Deeppath model, which is widely used, considers the entities of knowledge as the state spaces and navigates between them by selecting relations.…”

Section: KG Reasoning on Neural Network (mentioning)

confidence: 99%

A Schematic Review of Knowledge Reasoning Approaches Based on the Knowledge Graph

Vergara, Lee

2023

JEBI


Internet technology and its modes of implementation are advancing at a swift pace, leading to exponential growth in the scale of Internet data. This data contains a significant amount of valuable knowledge. The effective organization and articulation of knowledge, as well as the ability to conduct thorough calculations and analyses, have garnered significant attention and seen significant development. The use of knowledge graphs for knowledge reasoning has emerged as a prominent area of focus within knowledge graph research. It holds substantial significance for vertical search, intelligent answering, and various other applications. This article centers on the fundamental principles of reasoning. Knowledge reasoning oriented towards knowledge graphs focuses on deriving novel knowledge or detecting erroneous knowledge from pre-existing knowledge. In contrast to conventional knowledge reasoning approaches, the knowledge reasoning techniques employed in knowledge graphs are characterized by greater diversity, owing to the succinct, adaptable, and flexible representation of knowledge.

