“…In RL, it may refer either to training the agent on successive tasks of increasing complexity, until the desired complexity is reached [17,27,57,59,61,65], or, more commonly, to supplementing the MDP's reward function with additional, artificial rewards [3,14,15,24,38,49,50,53,85]. This article employs shaping functions in the latter sense.…”
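To make the latter sense concrete, a minimal sketch of additive reward shaping follows; the notation is assumed for illustration and is not taken from the excerpt: $R$ denotes the MDP's original reward function and $F$ the shaping function that supplies the artificial reward.

% Shaped reward: the agent is trained on R' in place of the original R.
% R : original MDP reward (assumed notation)
% F : shaping function providing the additional, artificial reward (assumed notation)
$$ R'(s, a, s') \;=\; R(s, a, s') \;+\; F(s, a, s') $$

Under this reading, designing a shaping function amounts to choosing $F$; the agent's learning problem is otherwise unchanged, since only the reward signal it observes is modified.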