2019
DOI: 10.1007/s10015-019-00523-3

Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning

Cited by 10 publications (8 citation statements)
References 14 publications
“…Notably, overestimation bias is a widespread phenomenon in value-based methods and may be effectively addressed by utilizing multiple critics in the actor-critic framework [13]. In turn, this significantly limits the algorithms' scalability through the increase in the number of gradient-based updates [14]. Furthermore, the memory complexity essential to efficient Reinforcement Learning methods tends to increase significantly, scaling linearly with the expressive power of the approximators [15].…”
Section: Model-based RL Approaches (mentioning, confidence: 99%)
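
The "multiple critics" remedy for overestimation bias mentioned above is commonly realized as clipped double-Q learning, in which the bootstrap target uses the smaller of two independent critic estimates. The sketch below is only a minimal illustration of that general idea, not the algorithm of the cited work; the `Critic` architecture, names, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

class Critic(nn.Module):
    """A small Q-network; architecture and size are illustrative assumptions."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        # Q(s, a) for a batch of state-action pairs.
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def td_target(q1_target, q2_target, reward, next_obs, next_act, done, gamma=0.99):
    # Taking the element-wise minimum of two target critics damps the upward
    # (overestimation) bias that a single bootstrapped estimate tends to accumulate.
    q_min = torch.min(q1_target(next_obs, next_act), q2_target(next_obs, next_act))
    return reward + gamma * (1.0 - done) * q_min
```

Maintaining two critics is also where the scalability cost noted in the quotation comes from: each training step now performs roughly twice as many gradient-based critic updates.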
“…On the flip side, model-based Reinforcement Learning approaches additionally learn a simplified model of the surrounding world [14]. The world model allows the agent to predict the outcomes of various potential action sequences, letting it play through hypothetical scenarios to reach informed decisions in new situations and reducing the trial and error needed to achieve its goal [17].…”
Section: Model-based RL Approaches (mentioning, confidence: 99%)
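
The behaviour described in this passage, predicting the outcomes of candidate action sequences with a learned world model, can be illustrated with a simple random-shooting planning loop. This is a hedged sketch of the general technique rather than the method of the cited papers; the `model.predict(state, action) -> (next_state, reward)` interface, the gym-style `action_space.sample()`, and the horizon and candidate counts are all assumptions.

```python
def plan_with_world_model(model, state, action_space, horizon=10, n_candidates=64):
    """Random-shooting planner: sample candidate action sequences, roll each out
    'in imagination' with the learned model, and return the first action of the
    sequence with the highest predicted cumulative reward.
    """
    best_return, best_first_action = float("-inf"), None
    for _ in range(n_candidates):
        # A hypothetical action sequence to play through inside the learned model.
        actions = [action_space.sample() for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s, r = model.predict(s, a)  # predicted next state and reward (assumed API)
            total += r
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action
```

Because the rollouts happen inside the model rather than the real environment, the agent can evaluate many hypothetical futures per real interaction, which is the reduction in trial and error the quotation refers to.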
“…For example, in the DST we might prefer a policy which alternates between the (-5, 3) and (-14, 50) returns, even though the SER for this approach is lower than for the policy which mixes the (-1, 1) and (-19, 124) returns. While there is prior work on reducing variance within risk-aware single-objective RL [6,21] and also on MORL approaches to risk-aware RL [35,9], we are not aware of any previous work that addresses the issue of reducing the variance in returns within the context of MORL.…”
Section: Introduction (mentioning, confidence: 99%)
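
The trade-off in this quoted example can be made concrete with a short calculation. Assuming, purely for illustration, a 50/50 mixture between the two underlying deterministic policies and a linear scalarization with equal weights (neither assumption comes from the cited text), the (-1, 1)/(-19, 124) mixture attains the higher scalarized expected return but with a far larger variance of scalarized returns, which is exactly the tension described:

```python
import numpy as np

# Deep Sea Treasure-style vector returns (time penalty, treasure) from the example above.
mix_a = np.array([[-5.0, 3.0], [-14.0, 50.0]])   # policy alternating between these returns
mix_b = np.array([[-1.0, 1.0], [-19.0, 124.0]])  # policy alternating between these returns

w = np.array([0.5, 0.5])  # assumed equal linear scalarization weights (illustrative only)

for name, returns in [("mixture A", mix_a), ("mixture B", mix_b)]:
    scalarized = returns @ w   # scalarized return of each underlying policy
    ser = scalarized.mean()    # scalarized expected return under a 50/50 mixture
    var = scalarized.var()     # variance of the scalarized return
    print(f"{name}: SER = {ser:.2f}, variance = {var:.2f}")

# With these assumed weights: mixture A gives SER 8.50 with variance 90.25, while
# mixture B gives SER 26.25 with variance 689.06 -- higher expected value, much higher variance.
```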