Bas R. Steunebrink scite author profile

Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful functional ANalysis Of VAriance framework. In total, we summarize the results of 5400 experimental runs (≈ 15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.

First experiments with PowerPlay

Srivastava, Steunebrink, Schmidhuber

2013

Neural Networks

Like a scientist or a playing child, POWERPLAY (Schmidhuber, 2011) not only learns new skills to solve given problems, but also invents new interesting problems by itself. By design, it continually comes up with the fastest to find, initially novel, but eventually solvable tasks. It also continually simplifies or compresses or speeds up solutions to previous tasks. Here we describe first experiments with POWERPLAY. A self-delimiting recurrent neural network SLIM RNN (Schmidhuber, 2012) is used as a general computational problem solving architecture. Its connection weights can encode arbitrary, self-delimiting, halting or non-halting programs affecting both environment (through effectors) and internal states encoding abstractions of event sequences. Our POWERPLAY-driven SLIM RNN learns to become an increasingly general solver of self-invented problems, continually adding new problem solving procedures to its growing skill repertoire. Extending a recent conference paper (Srivastava, Steunebrink, Stollenga, & Schmidhuber, 2012), we identify interesting, emerging, developmental stages of our open-ended system. We also show how it automatically self-modularizes, frequently re-using code for previously invented skills, always trying to invent novel tasks that can be quickly validated because they do not require too many weight changes affecting too many previous tasks.

A formal model of emotion triggers: an approach for BDI agents

2011

This paper formalizes part of a well-known psychological model of emotions. In particular, the logical structure underlying the conditions that trigger emotions is studied and then hierarchically organized. The insights gained therefrom are used to guide a formalization of emotion triggers, which proceeds in three stages. The first stage captures the conditions that trigger emotions in a semiformal way, i.e., without committing to an underlying formalism and semantics. The second stage captures the main psychological notions used in the emotion model in dynamic doxastic logic. The third stage introduces a BDI-based framework (belief-desire-intention) with achievement goals, which is used to firmly ground the preceding stages. The result is a formalization of emotion triggers for BDI agents with achievement goals. The idea of proceeding in these stages is to provide different levels of commitment to formalisms, so that it remains relatively easy to extend or replace the used formalisms without having to start from scratch. Finally, we show that the formalization renders properties of emotions that are in line with the psychological model on which it is based.
