2022
DOI: 10.1007/s10489-022-04354-x
Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving

Cited by 5 publications (1 citation statement)
References 40 publications
“…Monte Carlo methods solve this by allowing an optimal policy to be found solely from experience with the environment [15], but they lack sample efficiency. Model-based methods, both derivative-based and derivative-free, offer better sample efficiency in some cases and can handle model uncertainty, especially with limited data, using neural networks such as Bayesian neural networks [20] and ensemble models [21]. Model-based methods are well suited when exploration is costly, as in controlling physical systems [13], but they are more computationally expensive [15].…”
Section: Related Work 2.1 Classic Reinforcement Learning
confidence: 99%
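The excerpt notes that ensemble models can express model uncertainty from limited data, which is what makes them useful for model-based RL. A minimal sketch of the idea, on a hypothetical 1-D toy dynamics task (the system, dataset sizes, and linear models here are illustrative assumptions, not the cited papers' setups): fit several dynamics models on bootstrap resamples of the same transitions, and read epistemic uncertainty off their disagreement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D dynamics (illustrative assumption): s' = 0.9*s + 0.1*a + noise
def true_dynamics(s, a):
    return 0.9 * s + 0.1 * a + rng.normal(0, 0.01, size=np.shape(s))

# Small transition dataset, as model-based RL would gather from limited interaction
S = rng.uniform(-1, 1, size=200)
A = rng.uniform(-1, 1, size=200)
Y = true_dynamics(S, A)
X = np.stack([S, A], axis=1)

# Ensemble of K linear dynamics models, each fit on a bootstrap resample
K = 5
models = []
for _ in range(K):
    idx = rng.integers(0, len(X), size=len(X))
    w, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
    models.append(w)

def predict(s, a):
    """Mean next-state prediction and ensemble disagreement (epistemic uncertainty)."""
    x = np.array([s, a])
    preds = np.array([x @ w for w in models])
    return preds.mean(), preds.std()

mean_in, std_in = predict(0.5, 0.0)    # state inside the training range
mean_out, std_out = predict(5.0, 0.0)  # state far outside the training range
```

Disagreement grows where the data is sparse (`std_out > std_in`), so a model-based planner can treat high-disagreement regions as uncertain instead of trusting a single learned model there; the cited works use neural-network ensembles, but the disagreement principle is the same.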