2022
DOI: 10.3390/sym14102134

Deep Q-Learning Network with Bayesian-Based Supervised Expert Learning

Abstract: Deep reinforcement learning (DRL) algorithms interact with the environment and have achieved considerable success in several decision-making problems. However, DRL requires a significant amount of data before it can achieve adequate performance, and it may have limited applicability when DRL agents must learn in a real-world environment. Therefore, some algorithms combine DRL agents with supervised learning and leverage previous additional knowledge. Some have integrated a deep Q-learning networ…
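The combination the abstract describes — a Q-learning loss augmented by a supervised term learned from expert knowledge — can be sketched generically. This is a minimal illustration in the style of DQfD-like methods, not the paper's actual algorithm; all function names, the margin form, and the weight `lam` are assumptions for illustration.

```python
import numpy as np

def td_loss(q_sa, target):
    # Squared TD error between the evaluated Q(s, a) and its target value.
    return (q_sa - target) ** 2

def expert_margin_loss(q_values, expert_action, margin=0.8):
    # Large-margin supervised term (DQfD-style sketch): the expert's
    # action should score at least `margin` above every other action.
    margins = np.full_like(q_values, margin)
    margins[expert_action] = 0.0
    return np.max(q_values + margins) - q_values[expert_action]

def combined_loss(q_values, action, target, expert_action, lam=0.5):
    # Weighted sum of the TD loss and the supervised expert term;
    # lam trades off imitation against reward-driven learning.
    return td_loss(q_values[action], target) + lam * expert_margin_loss(q_values, expert_action)
```

When the agent's greedy action already matches the expert's, the supervised term is zero and learning reduces to plain Q-learning.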

Cited by 4 publications (2 citation statements)
References 15 publications
“…To optimize the training network, the difference between the evaluated network and the target network is calculated as the following loss function [32]:…”
Section: E-DQN Training of Drones
confidence: 99%
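The statement above refers to a loss computed from the difference between the evaluated network and the target network. The cited equation is not reproduced in this report; the sketch below shows the standard DQN form of such a loss, which is only an assumed reconstruction.

```python
import numpy as np

def dqn_loss(q_eval, q_next_target, actions, rewards, dones, gamma=0.99):
    # Assumed standard DQN form: mean squared error between the
    # evaluated Q(s, a) and the bootstrapped target
    # r + gamma * max_a' Q_target(s', a'), zeroed at terminal states.
    batch = np.arange(len(actions))
    targets = rewards + gamma * (1.0 - dones) * q_next_target.max(axis=1)
    return np.mean((q_eval[batch, actions] - targets) ** 2)
```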
“…The method set a threshold based on the maximum LR value, defined as the median of the minimum values of all recorded measures of the relative change in the MI between epochs, and took LR_min and LR_max as values smaller than the threshold. In this study, we do not propose a separate "LR Range Test"; if prior knowledge can be obtained from the study [43], the system designer defines LR_min and LR_max. The study [18] suggested that the components were known to fit the data a priori; therefore, we might remove them based on prior experience.…”
Section: Proposed Algorithm
confidence: 99%
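The statement above describes selecting learning-rate bounds by thresholding a recorded metric (the relative change in mutual information between epochs). A hypothetical sketch of that selection rule follows; the function name, the fractional threshold, and the exact criterion are assumptions, since the citing paper's precise procedure is not reproduced here.

```python
import numpy as np

def lr_bounds_from_range_test(lrs, metric, frac=0.1):
    # Hypothetical sketch: given learning rates tried in increasing
    # order and a recorded metric per LR (e.g. relative change in MI
    # between epochs), keep the LRs whose metric falls below a
    # threshold set as a fraction of the metric's maximum, and return
    # the smallest and largest surviving LR as (lr_min, lr_max).
    lrs = np.asarray(lrs, dtype=float)
    metric = np.asarray(metric, dtype=float)
    threshold = frac * metric.max()
    ok = lrs[metric < threshold]
    if ok.size == 0:
        return None
    return float(ok.min()), float(ok.max())
```

As the statement notes, when prior knowledge is available the designer may simply set LR_min and LR_max directly instead of running such a test.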