Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration

Groth, Oliver; Wulfmeier, Markus; Vezzani, Giulia; Dasagi, Vibhavari; Hertweck, Tim; Hafner, Roland; Heess, Nicolas; Riedmiller, Martin

doi:10.48550/arxiv.2109.08603

Cited by 5 publications

(5 citation statements)

References 29 publications


“…One of the first attempts to consider reinforcement learning agents with a sense of curiosity is [7,19], where curiosity is formulated as the error in an agent's ability to predict the consequence of its actions. The curiosity-powered agent learns how to interact with the environment by curiosity alone and able to learn skills to finish the game-play [11,12,17,20,25,28]. Authors of [10] address the problem of automatically exploring and testing 3D games using RL.…”

Section: Related Work (mentioning)

confidence: 99%

Towards Agent-Based Testing of 3D Games using Reinforcement Learning

Ferdous, Kifetew, Prandi et al. 2022

Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering


Computer game is a billion-dollar industry and is booming. Testing games has been recognized as a difficult task, which mainly relies on manual playing and scripting based testing. With the advances in technologies, computer games have become increasingly more interactive and complex, thus play-testing using human participants alone has become unfeasible. In recent days, play-testing of games via autonomous agents has shown great promise by accelerating and simplifying this process. Reinforcement Learning solutions have the potential of complementing current scripted and automated solutions by learning directly from playing the game without the need of human intervention. This paper presented an approach based on reinforcement learning for automated testing of 3D games. We make use of the notion of curiosity as a motivating factor to encourage an RL agent to explore its environment. The results from our exploratory study are promising and we have preliminary evidence that reinforcement learning can be adopted for automated testing of 3D games.
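The citation statement above describes curiosity as the error in an agent's ability to predict the consequences of its actions. A minimal sketch of that idea follows, assuming a hypothetical linear forward model and a squared-error reward; the class name, learning rate, and toy transition are illustrative assumptions and are not taken from any of the cited papers.

```python
import numpy as np


class LinearForwardModel:
    """Hypothetical forward model: predicts the next state from (state, action)."""

    def __init__(self, state_dim, action_dim, lr=0.1):
        # Weights start at zero, so every transition is initially "surprising".
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def intrinsic_reward(self, state, action, next_state):
        # Curiosity signal: half the squared prediction error.
        err = next_state - self.predict(state, action)
        return 0.5 * float(err @ err)

    def update(self, state, action, next_state):
        # One gradient step on the squared prediction error.
        x = np.concatenate([state, action])
        err = next_state - self.W @ x
        self.W += self.lr * np.outer(err, x)


# Toy transition (assumed values): the same experience is replayed many times.
model = LinearForwardModel(state_dim=2, action_dim=1)
s = np.array([1.0, 0.0])
a = np.array([0.5])
s_next = np.array([0.0, 1.0])

r_before = model.intrinsic_reward(s, a, s_next)
for _ in range(50):
    model.update(s, a, s_next)
r_after = model.intrinsic_reward(s, a, s_next)
```

In this sketch the reward for a transition shrinks as the model learns to predict it, so a reward-maximizing agent would be pushed toward transitions it cannot yet predict.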


“…Despite traditional RL that the learning is driven by an extrinsic reward signal, intrinsically motivated RL concerns task-agnostic learning (Sontakke et al, 2021b,a). Similar to animals' babies (Touwen et al, 1992), the agent may undergo a developmental period in which it acquires reusable modular skills (Kaplan and Oudeyer, 2003;Weng et al, 2001;Tian et al, 2021), such as curiosity and confidence (Schmidhuber, 1991a;Kompella et al, 2017;Burda et al, 2018;Mirza et al, 2020;Groth et al, 2021;Huang et al, 2022). Another aspect of such general competence is the ability of the agent to remain safe during its learning and deployment period (Garcıa and Fernández, 2015).…”

Section: Related Work (mentioning)

confidence: 99%

Physical Derivatives: Computing policy gradients by physical forward-propagation

Mehrjou, Soleymani, Bauer et al. 2022

Preprint


Model-free and model-based reinforcement learning are two ends of a spectrum. Learning a good policy without a dynamic model can be prohibitively expensive. Learning the dynamic model of a system can reduce the cost of learning the policy, but it can also introduce bias if it is not accurate. We propose a middle ground where instead of the transition model, the sensitivity of the trajectories with respect to the perturbation of the parameters is learned. This allows us to predict the local behavior of the physical system around a set of nominal policies without knowing the actual model. We assay our method on a custom-built physical robot in extensive experiments and show the feasibility of the approach in practice. We investigate potential challenges when applying our method to physical systems and propose solutions to each of them.


“…However, predictive agents are susceptible to the "dark room problem," where agents minimize predictive errors by either reducing their activity to zero or staying in places where nothing happens [42]. Predictive agents that explore and act in environments usually need additional components to work, such as separate action selection modules [13,38] or curiosity drives [18]. [36].…”

Section: Introduction (mentioning)

confidence: 99%

Neuron-level Prediction and Noise can Implement Flexible Reward-Seeking Behavior

Li, Brenner, Boesky et al. 2024

Preprint


We show that neural networks can implement reward-seeking behavior using only local predictive updates and internal noise. These networks are capable of autonomous interaction with an environment and can switch between explore and exploit behavior, which we show is governed by attractor dynamics. Networks can adapt to changes in their architectures, environments, or motor interfaces without any external control signals. When networks have a choice between different tasks, they can form preferences that depend on patterns of noise and initialization, and we show that these preferences can be biased by network architectures or by changing learning rates. Our algorithm presents a flexible, biologically plausible way of interacting with environments without requiring an explicit environmental reward function, allowing for behavior that is both highly adaptable and autonomous. Code is available at https://github.com/ccli3896/PaN.

