On the role of tracking in stationary environments

Sutton, Richard S.; Koop, Anna; Silver, David

doi:10.1145/1273496.1273606

Cited by 45 publications

(35 citation statements)

References 10 publications

“…Even if this were possible, though, the fitness of the entire lifetime is the most important factor, and this usually depends on learning efficiency more than the asymptotic result. Sutton et al [48] make related observations about the limitations of asymptotic optimality.…”

Section: Relation To Other Research (mentioning)

confidence: 99%

Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective

Singh

Lewis

Barto

et al. 2010

IEEE Trans. Auton. Mental Dev.

Abstract: There is great interest in building intrinsic motivation into artificial systems using the reinforcement learning framework. Yet, what intrinsic motivation may mean computationally, and how it may differ from extrinsic motivation, remains a murky and controversial subject. In this article, we adopt an evolutionary perspective and define a new optimal reward framework that captures the pressure to design good primary reward functions that lead to evolutionary success across environments. The results of two computational experiments show that optimal primary reward signals may yield both emergent intrinsic and extrinsic motivation. The evolutionary perspective and the associated optimal reward framework thus lead to the conclusion that there are no hard and fast features distinguishing intrinsic and extrinsic reward computationally. Rather, the directness of the relationship between rewarding behavior and evolutionary success varies along a continuum.

“…Li's KIMEL algorithm transforms the nonlinear input data with a kernel into a high-dimensional but linear feature space where linear IDBD is applied. Sutton and Koop [15], [16] developed another nice nonlinear extension IDBD-nl of the original IDBD algorithm using the logistic sigmoid function. It was applied for learning the game Go.…”

Section: Related Work (mentioning)

confidence: 99%

“…For all other algorithms we use this sigmoid function. An exception is Koop's IDBD-nl [15], [16] being derived for the logistic sigmoid function, which we use in this case instead of tanh. …”

Section: Other Algorithms (mentioning)

confidence: 99%

Online Adaptable Learning Rates for the Game Connect-4

Bagheri

Thill

Koch

et al. 2016

IEEE Trans. Comput. Intell. AI Games

Learning board games by self-play has a long tradition in computational intelligence for games. Based on Tesauro's seminal success with TD-Gammon in 1994, many successful agents use temporal difference learning today. But in order to be successful with temporal difference learning on game tasks, often a careful selection of features and a large number of training games is necessary. Even for board games of moderate complexity like Connect-4, we found in previous work that a very rich initial feature set and several millions of game plays are required. In this work we investigate different approaches of online-adaptable learning rates like Incremental Delta Bar Delta (IDBD) or Temporal Coherence Learning (TCL) whether they have the potential to speed up learning for such a complex task. We propose a new variant of TCL with geometric step size changes. We compare those algorithms with several other state-of-the-art learning rate adaptation algorithms and perform a case study on the sensitivity with respect to their meta parameters. We show that in this set of learning algorithms those with geometric step size changes outperform those other algorithms with constant step size changes. Algorithms with nonlinear output functions are slightly better than linear ones. Algorithms with geometric step size changes learn faster by a factor of 4 as compared to previously published results on the task Connect-4.

“…As the dynamics of a robot can change due to many external factors ranging from temperature to wear, the learning process may never fully converge, i.e., it needs a "tracking solution" [Sutton et al, 2007]. Frequently, the environment settings during an earlier learning period cannot be reproduced and the external factors are not clear, e.g., how the light conditions affect the performance of the vision system and, as a result, the task's performance.…”

Section: Curse Of Real-world Samples (mentioning)

confidence: 99%

Learning motor skills: from algorithms to robot experiments

Kober

Peters

2014

It - Information Technology

This publication is licensed under the following Creative Commons license: Attribution, NonCommercial, NoDerivs 2.0 Germany http://creativecommons.org/licenses/by-nc-nd/2.0/de/

Abstract: Ever since the word "robot" was introduced to the English language by Karel Čapek's play "Rossum's Universal Robots" in 1921, robots have been expected to become part of our daily lives. In recent years, robots such as autonomous vacuum cleaners, lawn mowers, and window cleaners, as well as a huge number of toys have been made commercially available. However, a lot of additional research is required to turn robots into versatile household helpers and companions. One of the many challenges is that robots are still very specialized and cannot easily adapt to changing environments and requirements. Since the 1960s, scientists have attempted to provide robots with more autonomy, adaptability, and intelligence. Research in this field is still very active but has shifted focus from reasoning-based methods towards statistical machine learning. Both navigation (i.e., moving in unknown or changing environments) and motor control (i.e., coordinating movements to perform skilled actions) are important sub-tasks. In this thesis, we will discuss approaches that allow robots to learn motor skills. We mainly consider tasks that need to take into account the dynamic behavior of the robot and its environment, where a kinematic movement plan is not sufficient. The presented tasks correspond to sports and games but the presented techniques will also be applicable to more mundane household tasks. Motor skills can often be represented by motor primitives. Such motor primitives encode elemental motions which can be generalized, sequenced, and combined to achieve more complex tasks. For example, a forehand and a backhand could be seen as two different motor primitives of playing table tennis. We show how motor primitives can be employed to learn motor skills on three different levels.
First, we discuss how a single motor skill, represented by a motor primitive, can be learned using reinforcement learning. Second, we show how such learned motor primitives can be generalized to new situations. Finally, we present first steps towards using motor primitives in a hierarchical setting and how several motor primitives can be combined to achieve more complex tasks. To date, there have been a number of successful applications of learning motor primitives employing imitation learning. However, many interesting motor learning problems are high-dimensional reinforcement learning problems which are often beyond the reach of current reinforcement learning methods. We review research on reinforcement learning applied to robotics and point out key challenges and important strategies to render reinforcement learning tractable. Based on these insights, we introduce novel learning approaches both for single and generalized motor skills. For learning single motor skills, we study parametrized policy search methods and introduce a framework of reward-weighted imi...
