Recurrent neural networks are powerful sequence learners. They are able to incorporate context information in a flexible way, and are robust to localised distortions of the input data. These properties make them well suited to sequence labelling, where input sequences are transcribed with streams of labels. Long short-term memory is an especially promising recurrent architecture, able to bridge long time delays between relevant input and output events, and thereby access long-range context. The aim of this thesis is to advance the state of the art in supervised sequence labelling with recurrent networks in general, and long short-term memory in particular. Its two main contributions are (1) a new type of output layer that allows recurrent networks to be trained directly for sequence labelling tasks where the alignment between the inputs and the labels is unknown, and (2) an extension of long short-term memory to multidimensional data, such as images and video sequences. Experimental results are presented on speech recognition, online and offline handwriting recognition, keyword spotting, image segmentation and image classification, demonstrating the advantages of advanced recurrent networks over other sequential algorithms, such as hidden Markov models.