2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)
DOI: 10.1109/icmla.2019.00147

Discretionary Lane Change Decision Making using Reinforcement Learning with Model-Based Exploration

Cited by 18 publications (6 citation statements)
References 10 publications
“…However, exploration of the environment is unlikely to exhaust all states even when the action policy includes random actions or noise [44,45], which means the global optimum may remain undiscovered. In particular, the proposed communication scenarios and indoor layouts are highly complex, making efficient and sufficient exploration a challenge.…”
Section: Federated Learning Model
confidence: 99%
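To make the coverage point above concrete: a minimal sketch (hypothetical gridworld; all names are my own, not from either paper) of a policy that mixes random exploratory actions into a default behavior, showing that it still visits only part of the state space:

```python
import random

def noisy_policy_coverage(grid=30, steps=5000, epsilon=0.3, seed=0):
    """Walk a grid x grid world with a policy that repeats its last move
    but takes a random action with probability epsilon; report how many
    distinct states actually get visited."""
    random.seed(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    x = y = grid // 2
    last = random.choice(moves)
    visited = {(x, y)}
    for _ in range(steps):
        if random.random() < epsilon:      # random exploratory action
            last = random.choice(moves)
        dx, dy = last                      # otherwise repeat the last move
        x = min(max(x + dx, 0), grid - 1)
        y = min(max(y + dy, 0), grid - 1)
        visited.add((x, y))
    return len(visited), grid * grid

if __name__ == "__main__":
    seen, total = noisy_policy_coverage()
    print(f"visited {seen}/{total} states")  # coverage stays incomplete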
“…To evaluate the performance of trained meta agents, we compare the PEARL-trained agent and the MAML-trained agent with a fine-tune method based on Trust Region Policy Optimization (TRPO) [35], with the safety check implemented from [36]. The fine-tune method simply keeps updating the initial policy in a new environment using collected data.…”
Section: B. Varying Social Behaviors From Egoistic To Altruistic
confidence: 99%
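A rough illustration of the fine-tune baseline described in this statement: the quoted method is TRPO with the safety check of [36], but the sketch below substitutes a plain REINFORCE update and a hypothetical one-step environment, so every name and the environment here are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """Small categorical policy standing in for the pretrained initial policy."""
    def __init__(self, obs_dim=4, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(),
                                 nn.Linear(32, n_actions))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

def toy_rollout(policy, batch=64, obs_dim=4):
    """Hypothetical 'new environment': one-step episodes whose reward
    favors action 0, standing in for data collected after transfer."""
    obs = torch.randn(batch, obs_dim)
    with torch.no_grad():
        actions = policy(obs).sample()
    returns = (actions == 0).float()
    return obs, actions, returns

def finetune(policy, rollout_fn, epochs=50, lr=1e-3):
    """Keep updating the initial policy on data collected in the new
    environment (a REINFORCE stand-in for the TRPO update in the quote)."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        obs, actions, returns = rollout_fn(policy)
        logp = policy(obs).log_prob(actions)
        loss = -(logp * (returns - returns.mean())).mean()  # baseline-subtracted
        opt.zero_grad(); loss.backward(); opt.step()
    return policy

policy = finetune(TinyPolicy(), toy_rollout)
```

The point of the baseline is exactly what the quote says: no meta-learned initialization machinery, just continued policy updates on new-environment data.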
“…Therefore, we will also report the crash rates of the fine-tune agent, the MAML agent, and the PEARL agent in Section VI. We will compare these crash rates against those of the benchmark policy trained in the original environment using the method designed in [36].…”
Section: B. Varying Social Behaviors From Egoistic To Altruistic
confidence: 99%
“…Pan et al. (2017) built a new framework on the basis of A3C to train a self-driving vehicle by interacting with a synthesized real environment [7]. Recently, Zhang et al. (2019) employed RL with model-based exploration for autonomous lane-change decision-making on highways [8]. Duan et al. (2020) employed RL with a hierarchical architecture for autonomous decision-making on highways [9], [10].…”
Section: Introduction
confidence: 99%
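The "model-based exploration" of [8] (the paper this report covers) is only named in this statement, not specified. One common instantiation of model-based exploration, offered purely as a hedged sketch and not as the authors' actual scheme, is an intrinsic bonus equal to the prediction error of a learned forward dynamics model:

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Learned dynamics model: predicts the next state from (state, action)."""
    def __init__(self, state_dim=6, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + n_actions, 64), nn.ReLU(),
                                 nn.Linear(64, state_dim))

    def forward(self, state, action_onehot):
        return self.net(torch.cat([state, action_onehot], dim=-1))

def exploration_bonus(model, state, action_onehot, next_state):
    """Intrinsic reward: transitions the model predicts poorly get a larger
    bonus, steering the agent toward under-explored parts of the space."""
    with torch.no_grad():
        pred = model(state, action_onehot)
    return ((pred - next_state) ** 2).mean(dim=-1)

# Sketch of use: the agent maximizes env_reward + beta * exploration_bonus(...)
# while the forward model is regressed on the transitions actually observed.
model = ForwardModel()
s, s2 = torch.randn(8, 6), torch.randn(8, 6)
a = torch.eye(3)[torch.randint(0, 3, (8,))]   # hypothetical one-hot actions
print(exploration_bonus(model, s, a, s2))      # per-transition bonus
```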