2021
DOI: 10.1613/jair.1.12270

Efficient Multi-objective Reinforcement Learning via Multiple-gradient Descent with Iteratively Discovered Weight-Vector Sets

Abstract: Solving multi-objective optimization problems is important in various applications where users are interested in obtaining optimal policies subject to multiple (yet often conflicting) objectives. A typical approach to obtain the optimal policies is to first construct a loss function based on the scalarization of individual objectives and then derive optimal policies that minimize the scalarized loss function. Albeit simple and efficient, the typical approach provides no insights/mechanisms…
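The scalarization step described in the abstract can be illustrated with a minimal sketch (not taken from the paper; the function name, weights, and loss values are illustrative assumptions). Under a fixed weight vector, minimizing the scalarized loss recovers a single trade-off among the objectives:

```python
import numpy as np

def scalarized_loss(objective_losses, weights):
    """Linear scalarization: combine per-objective losses into one
    scalar loss using a fixed, non-negative weight vector.

    objective_losses : shape (m,) array, one loss per objective
    weights          : shape (m,) array, non-negative, summing to 1
    """
    losses = np.asarray(objective_losses, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, losses))

# Example: two conflicting objectives, weighted equally.
print(scalarized_loss([0.8, 0.3], [0.5, 0.5]))  # -> 0.55
```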

Cited by 3 publications (2 citation statements). References 31 publications.
“…Reward vectors. The existing methods for MORL (Roijers et al., 2014; Mossalam et al., 2016; Xu et al., 2020; Cao & Zhan, 2021) and successor features (SF) (Barreto et al., 2017; Borsa et al., 2018; Hunt et al., 2019; Barreto et al., 2019; Zahavy et al., 2021) are related in terms of using reward vectors. In conventional SF settings, the agent optimizes its policy under the condition that scalar rewards are given.…”
Section: Related Work
confidence: 99%
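As a rough illustration of the reward-vector view shared by MORL and successor features, the sketch below shows the usual SF decomposition of action values as a dot product with a preference/task weight vector (a sketch under common SF assumptions, not code from any of the cited papers; `psi` and `w` are illustrative names and values):

```python
import numpy as np

# Successor features psi(s, a): expected discounted sum of reward
# features phi under the current policy (illustrative values).
psi = np.array([[1.2, 0.4],   # action 0
                [0.3, 1.5]])  # action 1

# Task / preference weight vector w; the scalar reward is r = phi . w,
# so the action value decomposes as Q(s, a) = psi(s, a) . w.
w = np.array([0.7, 0.3])

q_values = psi @ w               # Q(s, a) for each action
best_action = int(np.argmax(q_values))
print(q_values, best_action)     # [0.96 0.66] 0
```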
“…Hu et al. combined Lamarckian local search and deep multi-objective reinforcement learning to dispatch water valves and fire hydrants so as to isolate contaminated water and reduce residual contaminant concentrations in water distribution networks (Hu et al., 2022). Cao and Zhan proposed a new and efficient gradient-based multi-objective reinforcement learning method that aims to iteratively reveal the relationships between objectives by finding the minimum-norm point in the convex hull of multiple policy-gradient sets (Cao and Zhan, 2021). Hu et al. proposed the MO-MIX algorithm to solve the multi-objective multi-agent reinforcement learning problem within the centralized-training, decentralized-execution framework (Hu et al., 2023).…”
Section: Introduction
confidence: 99%
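The minimum-norm-point idea mentioned in the citation above (the point of smallest norm in the convex hull of per-objective gradients, as in multiple-gradient descent) has a closed form for two objectives. The sketch below illustrates that general technique, not the paper's exact procedure; the function name and example gradients are assumptions for illustration:

```python
import numpy as np

def min_norm_combination(g1, g2):
    """Minimum-norm point in the convex hull of two gradients, i.e.
    argmin over a in [0, 1] of ||a*g1 + (1 - a)*g2||.

    The optimal coefficient has the closed form
        a* = clip((g2 - g1) . g2 / ||g1 - g2||^2, 0, 1).
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    diff = g1 - g2
    denom = float(np.dot(diff, diff))
    if denom == 0.0:             # identical gradients: any convex combination works
        return g1
    a = float(np.clip(np.dot(g2 - g1, g2) / denom, 0.0, 1.0))
    return a * g1 + (1.0 - a) * g2

# Example: two conflicting policy gradients.
d = min_norm_combination([1.0, 0.0], [0.0, 1.0])
print(d)  # [0.5 0.5] -- a common descent direction for both objectives
```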