2018
DOI: 10.48550/arxiv.1809.06364
Preprint

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Abstract: Many reinforcement-learning researchers treat the reward function as a part of the environment, meaning that the agent can only know the reward of a state if it encounters that state in a trial run. However, we argue that this is an unnecessary limitation and instead, the reward function should be provided to the learning algorithm. The advantage is that the algorithm can then use the reward function to check the reward for states that the agent hasn't even encountered yet. In addition, the algorithm can simul…
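As a rough, hedged illustration of the idea in the abstract: if the multi-objective reward function is handed to the learning algorithm, the algorithm can query reward vectors even for states the agent has never visited. The Python sketch below is only a sketch under that assumption; the names reward_fn and evaluate_unvisited are illustrative and do not come from the paper.

import numpy as np

def reward_fn(state, action):
    # Illustrative multi-objective reward: one entry per objective.
    progress = float(state[0]) + 0.1 * action   # toy objective 1
    effort = -abs(float(action))                # toy objective 2
    return np.array([progress, effort])

def evaluate_unvisited(reward_fn, candidate_states, actions):
    # Because the reward function is provided to the algorithm, it can be
    # queried for (state, action) pairs the agent never experienced in a rollout.
    return {(tuple(s), a): reward_fn(s, a)
            for s in candidate_states
            for a in actions}

# Reward vectors for hypothetical, never-visited states.
hypothetical = [np.array([0.5, 0.0]), np.array([2.0, -1.0])]
reward_table = evaluate_unvisited(reward_fn, hypothetical, actions=[0, 1])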

Cited by 7 publications (17 citation statements)
References 5 publications

“…We use a generalized, intrinsically Multi-Objective RL strategy for stock and cryptocurrency trading. We implement this by considering extensions of the Multi-Objective Deep Q-Learning RL algorithm with experience replay and target network stabilization given in [6], and deploying it on the Nifty50 stock index and BTCUSD trading pair.…”
Section: Our Contribution (mentioning)
confidence: 99%
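The statement above pairs a multi-objective deep Q-learning agent with experience replay and target-network stabilization. Below is a minimal PyTorch sketch of the target-network part, assuming a Q-network that outputs one Q-vector per action (one component per objective); the class name, layer sizes, and update scheme are assumptions, not details taken from the cited papers.

import torch.nn as nn

class MultiObjectiveQNet(nn.Module):
    def __init__(self, state_dim, n_actions, n_objectives):
        super().__init__()
        self.n_actions, self.n_objectives = n_actions, n_objectives
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions * n_objectives),
        )

    def forward(self, state):
        # One Q-vector per action: shape (batch, n_actions, n_objectives).
        return self.net(state).view(-1, self.n_actions, self.n_objectives)

online = MultiObjectiveQNet(state_dim=8, n_actions=4, n_objectives=2)
target = MultiObjectiveQNet(state_dim=8, n_actions=4, n_objectives=2)
target.load_state_dict(online.state_dict())  # periodic hard copy keeps bootstrapped targets stable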
“…To the very best of our knowledge, ours is the first application of Multi-Reward RL in the sense of [6] to financial data.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the last couple of years, interest in Deep MORL has intensified, although primarily in single-agent settings (see e.g. [1,33,47,70,74,82,106,111,112]). Very recently, single-objective multi-agent RL has received considerable attention as well [30,32,39,55,97,81,109,130].…”
Section: Deep Multi-objective Multi-agent Decision Making (mentioning)
confidence: 99%
“…For our MORL agent we implement a vanilla DQN as described by Mnih et al. (2015), although our method is easily applicable to most RL algorithms. Recent advances in MTRL, such as the use of UVFAs for generalizing across goals, have become more common in multiple-objective settings (Friedman & Fontaine, 2018; Abels et al., 2018). We also utilize UVFAs to generalize across… As the agent gains experience, tuples of state, action, next state, terminal, and reward vector (s, a, s′, t, r) are stored in a replay buffer for future training.…”
Section: Multi-objective DQN (mentioning)
confidence: 99%
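That last statement describes the transition format quite concretely. Below is a minimal sketch of such a replay buffer, assuming NumPy arrays for states and a reward vector with one component per objective; the class and method names are illustrative and not taken from the cited works.

import random
from collections import deque
import numpy as np

class MultiObjectiveReplayBuffer:
    # Stores (s, a, s', terminal, reward_vector) transitions.
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, next_state, terminal, reward_vec):
        self.buffer.append((state, action, next_state, terminal,
                            np.asarray(reward_vec, dtype=np.float32)))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, next_states, terminals, rewards = zip(*batch)
        # rewards stacks to shape (batch_size, n_objectives).
        return (np.stack(states), np.array(actions), np.stack(next_states),
                np.array(terminals), np.stack(rewards))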