2022
DOI: 10.1007/s10489-022-04354-x
Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving

Cited by 5 publications (1 citation statement)
References 40 publications
“…Monte Carlo methods solve this by allowing an optimal policy to be found solely from experience with the environment [15], but they lack sample efficiency. Model-based methods, both derivative-based and derivative-free, offer better sample efficiency in some cases and can handle model uncertainty, especially with limited data, using neural networks such as Bayesian neural networks [20] and ensemble models [21]. Model-based methods are well suited when exploration is costly, as in controlling physical systems [13], but they are more computationally expensive [15].…”
Section: Related Work 2.1 Classic Reinforcement Learning
confidence: 99%
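The excerpt notes that ensemble models can express model uncertainty from limited data, which is what makes them useful for model-based RL. A minimal sketch of the idea, on a hypothetical 1-D toy dynamics task (the system, dataset sizes, and linear models here are illustrative assumptions, not the cited papers' setups): fit several dynamics models on bootstrap resamples of the same transitions, and read epistemic uncertainty off their disagreement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D dynamics (illustrative assumption): s' = 0.9*s + 0.1*a + noise
def true_dynamics(s, a):
    return 0.9 * s + 0.1 * a + rng.normal(0, 0.01, size=np.shape(s))

# Small transition dataset, as model-based RL would gather from limited interaction
S = rng.uniform(-1, 1, size=200)
A = rng.uniform(-1, 1, size=200)
Y = true_dynamics(S, A)
X = np.stack([S, A], axis=1)

# Ensemble of K linear dynamics models, each fit on a bootstrap resample
K = 5
models = []
for _ in range(K):
    idx = rng.integers(0, len(X), size=len(X))
    w, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
    models.append(w)

def predict(s, a):
    """Mean next-state prediction and ensemble disagreement (epistemic uncertainty)."""
    x = np.array([s, a])
    preds = np.array([x @ w for w in models])
    return preds.mean(), preds.std()

mean_in, std_in = predict(0.5, 0.0)    # state inside the training range
mean_out, std_out = predict(5.0, 0.0)  # state far outside the training range
```

Disagreement grows where the data is sparse (`std_out > std_in`), so a model-based planner can treat high-disagreement regions as uncertain instead of trusting a single learned model there; the cited works use neural-network ensembles, but the disagreement principle is the same.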