2020
DOI: 10.1007/s10489-020-01702-7

Combining a gradient-based method and an evolution strategy for multi-objective reinforcement learning

Cited by 23 publications (20 citation statements)
References 38 publications
“…The goal of the reinforcement learning algorithm is to find an optimal strategy based on the Markov decision process to maximize the expected cumulative return. In this section, the driving distance of the electric vehicle, the total driving and charging time, and the charging economy are optimized in parallel to provide the electric vehicle owner with the best electric vehicle charging navigation scheduling strategy [21,22].…”
Section: Electric Vehicle Charging Navigation Scheduling Strategy Based On Reinforcement Learning
Citation type: mentioning
confidence: 99%
“…Thanks to this ability to fine tune the loss function according to the environment and agent history, EPG can learn faster than a standard RL agent. Diqi Chen and Gao [182] proposed a hybrid agent to approximate the Pareto frontier uniformly in a multi-objective decision-making problem. The authors argued that despite the fast convergence of DRL, it cannot guarantee a uniformly approximated Pareto frontier.…”
Section: Hybrid Deep Reinforcement Learning and Evolution Strategies Algorithms
Citation type: mentioning
confidence: 99%
“…On the other hand, ES achieve a well-distributed Pareto frontier, but they face difficulties optimizing a DNN. Therefore, Diqi Chen and Gao [182] proposed a two-stage multi-objective reinforcement learning (MORL) framework. In the first stage, a multi-policy soft actor-critic algorithm learns multiple policies collaboratively.…”
Section: Hybrid Deep Reinforcement Learning and Evolution Strategies Algorithms
Citation type: mentioning
confidence: 99%
“…Diqi Chen and Gao [172] proposed a hybrid agent to approximate the Pareto frontier uniformly in a multi-objective decision-making problem. The authors argued that despite the fast convergence of DRL, it cannot guarantee a uniformly approximated Pareto frontier.…”
Section: Hybrid Deep Reinforcement Learning and Evolution Strategies Algorithms
Citation type: mentioning
confidence: 99%
“…On the other hand, ES achieve a well-distributed Pareto frontier, but they face difficulties optimizing a DNN. Therefore, Diqi Chen and Gao [172] proposed a two-stage multi-objective reinforcement learning (MORL) framework. In the first stage, a multi-policy soft actor-critic algorithm learns multiple policies collaboratively.…”
Section: Hybrid Deep Reinforcement Learning and Evolution Strategies Algorithms
Citation type: mentioning
confidence: 99%